Soft signals on commitment to package maintenance

proposal

#1

Motivation

Julia has been around for 6 years now, and for the last 2–3 years the number of packages has increased rapidly. As is inevitable with free software and single-person projects, some packages have received less maintenance after an initial period of activity, and quite a few packages appear to be abandoned.

I want to emphasize that I think this is normal, and people are under no obligation to contribute their time and effort to projects they are no longer interested in. At the same time, encountering code snippets or earlier discussion about a now-abandoned package can be confusing and frustrating to new users.

I don’t know what the right solution is, but I wanted to start some discussion about this. This is not a plan to formalize or enforce (haha) anything; the intention is to ensure a common understanding.

Objective measures of activity

Most Julia packages live in a git repository, usually on GitHub or similar. The main page shows the date of the last activity on the code, which can be informative; similarly, one can look at the dates of discussions in the issues.

Lack of recent activity is not by itself a signal of abandonment; there may simply be no issues. New Julia releases usually require some maintenance though, so I would say that a package should be considered potentially abandoned if there has been no activity

  1. 3 months after the last stable release,
  2. for the last 2 years.
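The two rules above can be written down as a small check. This is only a sketch of one reading of them: I am assuming that rule 1 means "no activity since the last stable Julia release, and that release is at least 3 months old", and that 3 months and 2 years can be approximated as 90 and 730 days. The function and constant names are made up for illustration.

```python
from datetime import date, timedelta

# Assumed day counts for "3 months" and "2 years".
RELEASE_GRACE = timedelta(days=90)
MAX_SILENCE = timedelta(days=730)


def potentially_abandoned(last_activity: date,
                          last_stable_release: date,
                          today: date) -> bool:
    """Return True if either of the two rules flags the package."""
    # Rule 1 (my interpretation): nothing happened since the last stable
    # release, and that release is at least 3 months old.
    missed_release = (
        last_activity < last_stable_release
        and today - last_stable_release >= RELEASE_GRACE
    )
    # Rule 2: no activity at all for the last 2 years.
    long_silence = today - last_activity >= MAX_SILENCE
    return missed_release or long_silence


if __name__ == "__main__":
    # Hypothetical dates, not taken from any real package.
    print(potentially_abandoned(date(2017, 1, 1), date(2017, 6, 20), date(2017, 10, 1)))
```

The dates would have to come from somewhere (the git log, the issue tracker), which the sketch deliberately leaves out.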

Signalling commitment to maintenance

This is about informing potential users about the maintainer’s intentions about the package, and is somewhat difficult to do since there are so many variables. For example, one may intend to support some package but currently lack the time because of other obligations or major life events.

That said, I find the repostatus badges quite useful, and the distinction between Suspended, Abandoned, Inactive, Unsupported fine-grained enough.

I would consider a repo Active if

  1. it is updated to work correctly within 2 months of the latest release (ideally, package maintainers would make sure things work on alpha before the release, but dependency problems can delay this),
  2. issues get some response within 1 month,
  3. PRs get some response within 1-3 months (depending on complexity).

This does not mean that issues and PRs are fixed/merged/closed within this timeframe, just that they get some discussion.
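To make the three criteria concrete, here is a rough sketch of how they could be checked. The `RepoStats` fields are hypothetical inputs that someone would have to collect by hand or from the hosting site, the day counts (60, 30, 90) are my approximations of "2 months", "1 month" and the upper end of "1-3 months", and a package that already supports the latest release is simply treated as passing the first criterion.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    """Hypothetical summary of a repository; all fields are assumptions."""
    supports_latest_release: bool      # already works on the latest release
    days_since_latest_release: int     # age of the latest release
    oldest_unanswered_issue_days: int  # 0 if every issue has some response
    oldest_unanswered_pr_days: int     # 0 if every PR has some response


def is_active(stats: RepoStats, pr_limit_days: int = 90) -> bool:
    """Check the three 'Active' criteria; a response counts, not a fix."""
    # 1. Updated for the latest release, or still inside the 2-month window.
    release_ok = (stats.supports_latest_release
                  or stats.days_since_latest_release <= 60)
    # 2. Issues get some response within 1 month.
    issues_ok = stats.oldest_unanswered_issue_days <= 30
    # 3. PRs get some response within 1-3 months; the limit is adjustable
    #    because the proposal makes it depend on complexity.
    prs_ok = stats.oldest_unanswered_pr_days <= pr_limit_days
    return release_ok and issues_ok and prs_ok
```

For example, `is_active(RepoStats(True, 200, 10, 45))` is true, while a single issue left without any reply for 31 days is enough to make it false.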

For repos which are not Active, it is a courtesy to change their status to something else, e.g. Suspended or Inactive. Also, repos should consider not starting out as Active but rather as WIP, and switch to Active after consistent maintenance for 1 year.

The above should not be construed as justification to badger package maintainers about unfixed issues etc. But once a package appears to be inactive for a while, I think an issue asking for clarification about its status is OK. Similarly, I think it is reasonable to wait 2-4 weeks before a “friendly bump” if some issue has not received attention.


#2

Although I really support what you are saying, and I think there should be some kind of indication of whether projects are abandoned or not, I think the repostatus badges are totally “useless”.

If someone who abandoned a repo cared, they could write it on the top line of the readme. If they wouldn’t do that, I don’t see why they would bother doing the repo status, which takes more effort.


#3

It is a generally nice idea.
Some scattered thoughts:


I dislike the name Inactive because it is too negative sounding for that state.
Like there is a state where a package reaches a happy place.
It has no known bugs and it does what it set out to do.
It is “Inactive”, but it may actually be better maintained than something that is active.


I guess another thing might be activeness does not correspond to supportedness.
To use one of my packages as an example, Pipe.jl has had only a handful of commits since the 0.3 days. It is a simple package that implements a simple idea. It is done. It works on 4 versions of julia and on every platform.
But if someone found a bug in it, I can honestly say that I’d probably have it fixed within 72 hours (in part because any bug in it would have to be simple to fix, because it is so simple).
Conversely, TensorFlow.jl is probably “Active”, I’ve tagged several releases in the last few months.
But if you raise an issue there, I’ll likely reply in 72 hours.
But fixing it might not happen for weeks, because TensorFlow tends to have difficult to handle bugs.
And I can only really support it on linux.


I kind of think there are at least 2 axes.

  • Completeness/stability, which is kinda covered by SemVer, if people were actually tagging a 1.0+ when no longer WIP.
  • Maintenance/Activity.

I think there is also some kind of deprecated status,
which is different from moved or abandoned (some deprecated libraries might even be maintained to some extent. idk though.)

For example, Zlib.jl is deprecated in favour of Libz.jl, which is deprecated in favour of CodecZlib.jl.
And deoplete-julia is deprecated in favour of LanguageServer.jl.


#4

You are right; what I propose requires some minimal cooperation from package authors. It implicitly assumes that there are people who are willing to make a bit of effort in order to benefit others, but who may be unaware that simply declaring the status of a package is helpful, even when it is abandoned.

I have no proposal about packages for which maintainers are completely unresponsive or unreachable.

Also, while I find the repostatus badges useful, I think that other forms of signalling about the status are totally fine, e.g. in the README as you suggested.


#5

I agree that this is indeed possible (I have some Common Lisp libraries I have not touched for a while, but are still popular), but this requires a stable language. With Julia so far, not updating after a release has usually left a package broken/unusable on subsequent releases (unless the functionality is really simple). This of course may change after 1.0, and you are right that we should allow for that.


#6

Something I’d like to see is a “still alive and maintained” notice once a year, or even every 6 months (just have a “last review of issues was after date xyz” entry in the news / readme that you regularly update). Like a TCP keepalive.

Julia is not mature yet. You’d expect a relatively small C project to be still good after 10 years without commits, but I think this is an unrealistic goal for julia 1.0. And for larger C projects, I’d also be very skeptical about projects that have not seen commits for 5 years.


#7

Perhaps this could even be automated: packages that have no activity (either on the code or a notice of this kind) for a specified period are notified by a package registry bot, and if there is no response for a while they are gracefully removed.
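A rough sketch of the decision such a bot would have to make, just to show that the logic is simple. No such bot exists as far as I know; the function name, the returned strings, and both default periods (the "specified period" and the "while" are not pinned down above) are placeholders.

```python
from datetime import date, timedelta
from typing import Optional


def bot_action(last_sign_of_life: date,
               notified_on: Optional[date],
               today: date,
               quiet_period: timedelta = timedelta(days=365),
               response_window: timedelta = timedelta(days=60)) -> str:
    """Decide what a hypothetical registry bot should do for one package.

    last_sign_of_life is the most recent code activity or keepalive-style
    notice; notified_on is when the bot last pinged the maintainer, if ever.
    """
    # Recent activity or a recent notice: nothing to do.
    if today - last_sign_of_life < quiet_period:
        return "ok"
    # Never notified, or the last notification predates the last sign of
    # life (the maintainer reacted to it back then): send a new notice.
    if notified_on is None or notified_on < last_sign_of_life:
        return "notify"
    # Notified and no reaction for the whole response window.
    if today - notified_on >= response_window:
        return "remove"
    # Notified recently: keep waiting for a response.
    return "wait"
```

Any response from the maintainer, even just updating the notice, resets `last_sign_of_life` and puts the package back in the "ok" state.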


#8

@Tamas_Papp In BioJulia, we had some threads questioning the status of our packages.
As a result I’ve been rolling out the following to every package repo we maintain:

A HUMANS.md file containing the names and emails of the specific BioJulia members dedicated to leading the maintainership of the package. In the README.md files I’ve also been rolling out lifecycle badges, like the ones the R tidyverse recently started using: https://www.tidyverse.org/lifecycle/

This should really help people know not only if a BioJulia package is maintained, but also if it’s experimental, maturing, considered ready or abandoned and so on. Currently packages are either experimental, maturing or retired; with julia 1.0 most of the maturing packages will be made “active”.

So the above is a potential solution I’m trying now for BioJulia, it might be useful for others too.


#9

@Ward9250 nice.

The LifeCycle badges (https://www.tidyverse.org/lifecycle/) work a lot better for me than the repostatus badges (http://www.repostatus.org/).


#10

BioJulia is extremely organized, the contribution guidelines are a very nice template for other projects, too.

I did not know about the lifecycle badges, but they may be more suitable than the repostatus ones.

I saw the links in the document to the badges, but I am curious who is maintaining these versions, tidyverse, BioJulia, or a third party?


#11

Tidyverse came up with the badges, but since they are just generated with shields.io, I think that as long as we have them listed and explained in our CONTRIBUTING.md we don’t need to worry about who maintains them. If we decide to change the badges, or add or remove some, we can just alter our files and let our badges diverge from the tidyverse ones.
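To illustrate why nobody needs to "maintain" the badges: a shields.io static badge is just a URL of the form `https://img.shields.io/badge/<label>-<message>-<color>.svg`, so the README line can be generated in a few lines. In this sketch the stage names are the ones mentioned above, but the colors and the helper names are my own assumptions, not the tidyverse or BioJulia definitions.

```python
LIFECYCLE_URL = "https://www.tidyverse.org/lifecycle/"

# Stage -> color; the colors are assumed, pick whatever your project documents.
STAGE_COLORS = {
    "experimental": "orange",
    "maturing": "blue",
    "active": "brightgreen",
    "retired": "lightgrey",
}


def escape(text: str) -> str:
    """Escape text for a shields.io static badge path segment."""
    # shields.io uses '--' for a literal dash, '__' for a literal
    # underscore, and '_' for a space.
    return text.replace("-", "--").replace("_", "__").replace(" ", "_")


def lifecycle_badge(stage: str) -> str:
    """Return a Markdown badge line for the given lifecycle stage."""
    color = STAGE_COLORS[stage]  # KeyError for an undocumented stage
    image = f"https://img.shields.io/badge/lifecycle-{escape(stage)}-{color}.svg"
    return f"[![lifecycle: {stage}]({image})]({LIFECYCLE_URL})"


if __name__ == "__main__":
    print(lifecycle_badge("maturing"))
```

Changing or adding a stage is then a one-line edit to the table, which is all "diverging from the tidyverse ones" would amount to.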


#12

Thanks, I was confused because the rendered page takes images from camo.githubusercontent.com, but looking at the sources I can see that is just caching.


#13

As the maintainer of three packages that haven’t reasonably needed an update since 0.5, this seems like adding work for little benefit. Very few people use OAuth.jl, Twitter.jl and NoveltyColors.jl, but there are also no known issues. So I’d be certifying to myself that my own packages work.


#14

I am a bit confused: OAuth.jl, for example, had 10+ commits in the past year, so by any reasonable rule it would be classified as “active” and not be affected. The same applies to Twitter.jl.

NoveltyColors.jl is a special case since (if I understand correctly) it has mostly data. Still, there are other packages like this, so this is a good point.


#15

My estimates, which I should publish soon, are:

  • Initial analysis
    • DEPRECATED: 38
    • DEVELOPMENT: 62
    • OK: 1,512
    • UNMAINTAINED: 428

#16

One other thing that could help would be to at some point resuscitate PackageEvaluator, so that one can know whether a package still passes its tests despite not having been updated for a long time. However, it seems like it’s quite tricky to keep PackageEvaluator working. I wonder whether there are alternative ways to test that the build passes every few days on all packages (maybe packages can somehow set up a “scheduled build” on Travis? I really don’t know much about these infrastructure setups).


#17

I think the goal of this thread is quite worthy; I often try to evaluate packages in the way you describe. One indicator that seems automatable and possibly useful: the number of packages with passing tests that depend on this package. But it may be that packages that are often depended upon have so many other indicators of quality that this is redundant.


#18

Related: Preliminary Analysis of Package Status