Bounds on package versions

Given the discussion on the DataFrames.jl tagging/dependencies cycle that we recently had on Slack, I have the following general question.

Currently the package manager (Pkg) is strict about ensuring that the versions of dependencies are consistent. This is great for production. I also think it is great for core packages that have guaranteed long-term active maintenance.

However, I see a potential problem of the following nature (this has probably been discussed already; if so, could someone point me to the discussion? Unfortunately I could not find it). As the package ecosystem grows and matures every year, we are more and more likely to run into the following situation:

We have some great package A that has a dependency B. Package A is no longer maintained (e.g. it does what it should and does it well, no changes are needed, and the creator focuses on other things). But it depends on a certain version of package B that after some time became outdated, and B has since had a breaking change, so the current version of B is not supported by A.

Currently, if you need B in its latest version because of some other packages, you are not allowed to have A installed at the same time (the package manager reports an error when you try to install it). I feel (maybe I am wrong, in which case please correct me) that in the long run this might become problematic, as such cases will become more and more frequent (I am talking here about 5 to 10 years from now).

What are the possible solutions (I know their drawbacks, but I feel that we could discuss what to do with this situation):

  • allow package A to internally load a different version of B than the one otherwise installed (so essentially we would have two versions of B: one used internally by A, the other available to the user; if there is a version clash, e.g. an object from the global B is passed to the local B inside A, we error; something like this is already possible with modules, since each module can have a submodule with the same name but a different implementation)
  • allow installing A anyway (e.g. via some force parameter to add) and warn the user that they do so at their own risk: A might not work correctly with the latest version of B, and no feasible combination of versions meets all the version restrictions
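
The module analogy in the first option can be made concrete with a small sketch. The names `A`, `B`, `Table`, `total` and `summarize` are hypothetical and only illustrate plain Julia modules; this is not how Pkg loads packages today, just a picture of what "two versions of B, with an error on a clash" could look like.

```julia
# The "global" B, standing in for the latest version visible to the user.
module B
    struct Table
        data::Vector{Int}
    end
end

module A
    # A's private submodule, standing in for the older B that A was written against.
    module B
        struct Table
            data::Vector{Int}
        end
        total(t::Table) = sum(t.data)
    end

    # Inside A, the name B refers to A.B, not to the top-level B.
    summarize(t) = B.total(t)
end

# Works: the object was created with A's own copy of B.
println(A.summarize(A.B.Table([1, 2, 3])))

# Fails: B.Table and A.B.Table are distinct types even though they look identical,
# so no method of A.B.total matches and a MethodError is thrown.
try
    A.summarize(B.Table([1, 2, 3]))
catch err
    println(typeof(err))
end
```

The error in the second call is the "version clash" case described above: an object from the global B reaches code that only knows the local B.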

So my question to the community is:

  • maybe I grossly do not understand something and we do not have this problem (it would be great then!)
  • if we have this problem - then do you think any of the solutions proposed above would be acceptable

Thank you for any comments on this.

This isn’t a question for 5-10 years from now; lots of us have been dealing with this for months and there are a ton of discussions scattered all over the package ecosystem. There are two strategies:

  • use different Project environments. In an environment where you don’t need A you can use the latest version of B.
  • release a new version of A that supports the latest B. This might be as simple as bumping the versions in the [compat] section, but you should always test that it actually works before inflicting it on users.
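
As a rough illustration of what "bumping the versions in the [compat] section" means, the sketch below edits a made-up Project.toml for package A with the TOML standard library (available from Julia 1.6). The package names, UUIDs and version numbers are all hypothetical, and in practice one would usually edit the file by hand and run the tests before releasing.

```julia
using TOML

# Hypothetical Project.toml of package A; every value here is made up.
project_text = """
name = "A"
uuid = "00000000-0000-0000-0000-000000000001"
version = "1.2.3"

[deps]
B = "00000000-0000-0000-0000-000000000002"

[compat]
B = "0.20"
julia = "1"
"""

project = TOML.parse(project_text)

# Widen the bound so that both the old and the new B series are accepted ...
project["compat"]["B"] = "0.20, 0.21"
# ... and bump A's patch version for the new release.
project["version"] = "1.2.4"

# Print the updated file contents (keys sorted for a stable layout).
TOML.print(stdout, project; sorted=true)
```

The comma in `"0.20, 0.21"` is the usual way to list several accepted version ranges in a compat entry.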

For mitigating this problem: that’s basically what CompatHelper is for. And lots of us want something like Pkg.jl pull request #1606 by DilumAluthge, “RFC: Warn if `] up Foo` is not able to install the latest version of `Foo`”.

release a new version of A that supports the latest B.

This might require forking a project. Can a new version be released from a different GitHub project without problem?

Thank you for the response. I understand the approach you describe (I use it on a daily basis), but there are the following problems in the situation I am talking about:

use different Project environments.

If I understand the standard Julia workflow correctly, one uses a separate project environment for every project.
I am talking about the case when you need A and B in the same project environment.

release a new version of A that supports the latest B.

This is a package maintainer’s perspective, and of course this is what should be done. However, I am talking about the user’s perspective. In this case you cannot reasonably expect that

you should always test that it actually works before inflicting it on users.

is possible, since to do it you actually have to contribute to a public repository and get your PR approved, which is a very significant blocker. Such a process also takes time, and usually when I need packages A and B I am willing to spend 5 minutes, not 5 days, installing them.

I am writing about 5 to 10 years’ time precisely because by then we should assume that for many good packages there will be no one who is even able to review such PRs without significant effort.

Therefore apart from the things you have mentioned in your post (which I think are adequate for actively maintained packages) I feel that we need a separate solution for the “long tail” of the package ecosystem that can be applied by an end user locally without interacting with packages themselves.

In particular, as far as I understand, the first solution (allow each package to use a different version of a dependency if it needs to) should be possible. A problem can arise when a package does type piracy, but such a situation could possibly also be detected automatically and disallowed (this is probably not super easy, but maybe it is doable).

Also note that I am not writing this to complain, but to discuss a long-term blocker for “normal users” of the Julia package ecosystem.

Sorry, I missed the word “not” in your description of A’s maintenance. I should have read more carefully.

If the maintenance of A is really problematic, the best option is to give it another maintainer. There should be a grace period, but it should be made clear to the original owner that the project will be forked if s/he remains unresponsive. Often one can arrange an agreeable transfer to an organization. Once the transfer is complete, submit a PR for an updated URL for the registries/General/S/SomePkg/Package.toml file.

Another option is to sidestep the typical release process and bump the versions in registries/General/S/SomePkg/Compat.toml manually.

Another option is to sidestep the typical release process and bump the versions

And I just made such a PR yesterday (“Additional bounds on dependencies on old DataFrames.jl releases” by bkamins, Pull Request #10472 in JuliaRegistries/General). Realistically even package maintainers are not able to do it with 100% confidence that they do the right thing.

If the maintenance of A is really problematic, the best option is to give it another maintainer.

Again a practical example. No one currently maintains JuliaData/DataFramesMeta.jl (metaprogramming tools for DataFrames), even though I have asked several times if there are people willing to do so.


But given your answers I understand that you do not see a possibility of any solutions of the kind I describe as half measures to solve the problem fast (like in several minutes) if some packages have conflicting dependencies. Is this a correct understanding?

Realistically even package maintainers are not able to do it with 100% confidence that they do the right thing.

Yes. We’d need tooling to make it less error prone if it becomes common.

I have asked several times if there are people willing to do so.

If you’re willing to maintain it at least to the point of bumping dependency versions, then I guess it’s a question of getting someone to give you permission. If you’re not (which would be understandable given your other commitments), then it’s a question of finding someone who has time. There may not be anyone at the moment, in which case you’re back to asking whether you can at least do the minimum.

you do not see a possibility of any solutions of the kind I describe as half measures

I worry that they are much harder than the alternatives. Fundamentally this is a social issue, not a technical one.

I was just thinking about this yesterday when merging some retroactively added upper bounds into the registry. When we introduced required upper bounds for packages, some people were really upset about it (even though it was just optional!) (see “Please be mindful of version bounds and semantic versioning when tagging your packages”). Since then there has been mostly silence, and AFAIK this has not been a problem in practice. Most “complaints” are rather about old versions that have no bounds and claim compatibility with everything, so we have to retroactively put constraints on them.

The main problem right now seems to be that people release breaking (as in Pkg considers them breaking) versions without any breaking code. This requires maintainers of packages that depend on said package to bump their compat, bump the patch version and re-release, which is not that difficult and takes about 10 seconds (plus maybe running the tests to verify that the nominally breaking version was in fact non-breaking).
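
To spell out what "breaking as Pkg considers it" means, here is a minimal sketch of the default caret rule for compat entries, assuming a full three-component lower bound. The helper names `caret_upper` and `is_compatible` are made up for illustration and are not part of Pkg; prerelease versions are ignored.

```julia
# Exclusive upper bound of the caret range that starts at `lo`:
# the leftmost non-zero component is the one that may not change.
function caret_upper(lo::VersionNumber)
    if lo.major > 0
        return VersionNumber(lo.major + 1, 0, 0)
    elseif lo.minor > 0
        return VersionNumber(0, lo.minor + 1, 0)
    else
        return VersionNumber(0, 0, lo.patch + 1)
    end
end

# A version `v` satisfies the compat entry `lo` if it lies in [lo, caret_upper(lo)).
is_compatible(lo::VersionNumber, v::VersionNumber) = lo <= v < caret_upper(lo)

println(is_compatible(v"1.2.3", v"1.9.0"))    # true: same major version
println(is_compatible(v"1.2.3", v"2.0.0"))    # false: major bump counts as breaking
println(is_compatible(v"0.20.0", v"0.20.5"))  # true: patch bump in the 0.x series
println(is_compatible(v"0.20.0", v"0.21.0"))  # false: minor bump counts as breaking for 0.x
```

The last case is why a 0.20 to 0.21 release forces every dependent package to bump its compat entry, even when no code actually broke.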

If all maintainers of a package go MIA, it can always be solved in other ways, as Tim says.

I agree this is a social issue, but I just feared that we will face it in the future (just imagine @quinnj no longer maintaining CSV.jl :smile:).

The main problem right now seems to be that people release breaking (as in Pkg considers them breaking) versions without any breaking code.

This is not that simple. You should tag a breaking release if you break “something”, even if it is something relatively minor. Of course you should avoid doing so and wait until you have accumulated enough breaking changes that tagging a breaking release is really worth it, but in practice I do not see this often (on GitHub you would have to maintain two branches, e.g. master with “non-breaking” changes, released more often, and another one that also contains the breaking changes and is merged into master only from time to time; this is of course doable, but the workflow is simply harder than the default GitHub workflow).


In summary - thank you for commenting. The thing that is crucial for me is:

I worry that they are much harder than the alternatives

So if this was investigated and no simple half-measure is good, we have to live with this. Thank you!

A similar solution is possible if we have:

and then package B updates the UUID for each increment of the major version. This is equivalent to what Go is doing. I think this is a good direction for avoiding fragmentation of the ecosystem.

Thank you for bringing up this problem. I agree with those who consider it mainly a social one, but it would be great if we could develop more or less standard ways of dealing with it.

First, I would recommend that we set realistic goals. Nontrivial software written in a language and depending on an ecosystem which both evolve rapidly will inevitably rot unless continuously maintained.

The fact that one can make something work in some environment (with older versions of packages) is itself a pretty amazing feature of languages with modern package managers like Julia. This should allow interested users to evaluate a package, and decide if it is worthy of further investment. Then, if it makes sense, they should be able to fork unmaintained projects and take over gracefully even in the event of the original maintainers not responding.

This is currently possible, especially if one does not insist on continuing with the package name. That said, maybe unmaintained package names should be retired from the registry and recycled after a while, so that we don’t end up with SeventhIncarnationOfSomePackage.jl.

Socially, I think we should be prepared to accept forks as inevitable and encourage them. The registry and the package docs should have a tutorial-like section on forking packages (what happens, how you should do it properly).

FWIW, I think it is pretty rare for nontrivial software to work well after years of neglect. At least not in a verifiable way (i.e. we don’t know about major bugs because no one is using it and CI stopped running a while ago). In my experience, issues just keep piling up, then at some point people lose interest.

That’s a reasonable interim solution, but occasionally only for the short run. There are quite a few packages which were transferred to an organization, and get minor PRs merged occasionally, but occasional major refactoring (not only to improve the package per se, but to mesh with another one better) is usually not feasible without a dedicated maintainer.

We should also keep in mind that a lot of Julia software is experimental. Not everyone feels they should invest a lot in CI and documentation, at least initially. Occasionally, even if an old package “works”, it is easier to write a new one than to fork and refactor.