Forward compatibility and stability of Julia vs. Packages

Totally agree that it’s not common, but:

It depends!
Take a scenario where you have a project confirmed working, with a set of versions: Julia 1.x.y, PackageA 0.x.y, PackageB 1.x.y, … . And you want to update one of these versions to something newer, keeping everything else exactly as it was.
This tends to be trivial with packages, but often not possible with Julia - at least some updates from 1.x to 1.(x+1) break lots of popular packages.

Examples? The only cases I know about are packages that use/manipulate Julia internals: things like JuliaInterpreter, SnoopCompile, and Cassette. It’s expected that anything manipulating Julia’s internal representations (which are not subject to the stability guarantee) is breakable, but packages written to Julia’s public interface should not have this happen. If it’s happening to lots of things, maybe people are a wee bit too excited about metaprogramming arcana? :slight_smile:

It’s not like we don’t do actual work to maintain compatibility: we test each Julia release candidate against the entire registered package ecosystem. By comparison, very little in the package ecosystem (with some notable exceptions, of course) goes through the same rigorous testing before release.


Aside from those, popular packages shouldn’t be breaking.

Meanwhile, in the open source ecosystem, breakages are routine.
For example, today we have a new patch version of StatsBase which breaks KernelDensity.jl’s precompilation, which has 271 open source dependents that now all fail to precompile and load.

This is a regular occurrence among packages, but they can thankfully be updated quickly.
Still, you basically have to pin most packages if you have a lot of dependencies and want a stable experience.
If you need a bug fix, I can see the appeal of upgrading exactly one thing. But Julia 1.x to 1.(x+1) will almost always be safer than upgrading a package x.y.z to x.y.(z+1).
(FWIW, this example is because of one package relying on another’s internals gratuitously. But packages do that a lot.)


My impression is that it is actually pretty common for packages to muck around with Base internals, so it’s not always possible to update the Julia minor version while keeping all other package versions fixed. DataStructures.jl and DataFrames.jl are two popular packages that use Base internals. In particular, DataFrames.jl overloads Base.dotview and Base.dotgetproperty, which are not a part of the publicly documented API for Base Julia.

See this issue for more discussion of the DataFrames situation:


This seems like a terrible practice. Both just add new data structures, which is the kind of thing I never expected to mess with the internals. I would assume JuliaInterpreter, SnoopCompile, and Cassette need to do so and be vigilant about it, but would never have assumed the same about DataFrames.jl or DataStructures.jl.


Among recent breakages that I remember - 1.9 broke SortingAlgorithms, and versions before the corresponding fix don’t even precompile. So, lots of packages are effectively broken by 1.9, because SortingAlgorithms is a very popular dependency (e.g. through StatsBase).

Sure, it happens because they used some internals not covered by semver. But for the user this doesn’t change anything.

Would be interesting to test this systematically from time to time. Run tests of all packages with Julia 1.x, fixing the registry state to the release date of Julia 1.(x-1).0.

IME, updating a package this way almost never breaks anything. It helps that packages typically have far fewer dependents than Julia itself (everything depends on it).

Shouldn’t the authors of the package make a PR to Julia to make those functions part of the documented API, if there is no solution that avoids using them? That might be accepted or not, but if it is not, at least some discussion of alternatives will be stimulated, and perhaps an alternative will arise.


I’m sure we could find more examples of using or overloading Base internals. Here are some from StaticArrays.jl:

The second two should definitely be removed. They aren’t needed by anyone outside of base (they are just used for bootstrapping).


Any package that uses Julia internals and has a Julia compat that looks like this,

[compat]
julia = "1.6"

is using SemVer wrong. The compat should actually be like this,

[compat]
julia = "=1.6.7"

because even a patch release of Julia could potentially break the package. Unfortunately, I think there are a lot of packages using Julia internals with a compat entry like the former rather than the latter.
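
To make the difference between the two entries concrete, here is a minimal sketch of which Julia versions each one admits. The helpers `caret_allows` and `equality_allows` are hypothetical names written for illustration, not Pkg's API; the sketch assumes the documented caret semantics for a nonzero major version and ignores pre-release versions.

```julia
# Hypothetical helpers (not part of Pkg) that mimic how compat entries
# are interpreted for a nonzero major version.

# `julia = "1.6"` is a caret specifier: it admits [1.6.0, 2.0.0).
caret_allows(lower::VersionNumber, v::VersionNumber) =
    lower <= v < VersionNumber(lower.major + 1)

# `julia = "=1.6.7"` is an equality specifier: it admits exactly that version.
equality_allows(pin::VersionNumber, v::VersionNumber) = v == pin

for v in (v"1.6.7", v"1.6.8", v"1.9.0", v"2.0.0")
    println(v, ": caret=", caret_allows(v"1.6.0", v),
            ", equality=", equality_allows(v"1.6.7", v))
end
```

Under the caret entry, every later 1.x release is declared compatible without ever having been tested, which is the mismatch described above for packages that reach into internals.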


1.9 is a bit of a special case, and nightly/master doesn’t count at all, because they are not actually released yet. I don’t know the specifics of the SortingAlgorithms breakage. Is it JuliaCollections/SortingAlgorithms.jl Pull Request #70, “Add comment to broken test and fix it by using Base.Sort.InitialOptimizations” by LilithHafner? That seems to only be about unreleased versions of Julia. Again, doesn’t really count, because the PkgEval runs only happen before release candidates and releases.


It is fair to say that 1.9.0 cannot, by definition, break anything because it is not released yet, and that if you know of such a breakage you should report it as a new issue immediately so we can address it in the rapidly progressing 1.9.0 final release.

But I imagine the SortingAlgorithms change of note here is JuliaCollections/SortingAlgorithms.jl#63, which was a carefully considered one-two step by a core contributor alongside JuliaLang/julia#47383… which was an explicit refactor such that the package no longer needs to rely upon internals. This is exactly the kind of stabilization that folks are clamoring for here.

Yes, if you (or some package) have SortingAlgorithms pinned to v1.0 or prior, it won’t work with Julia v1.9. But there has always been a version of SortingAlgorithms that has worked — even against the nightly build, even immediately after the refactor’s merge. Note the merge dates: SortingAlgorithms was prepared for the refactor a month ahead of time, and we’ve had nearly half a year to move to SortingAlgorithms v1.1. And only one now-archived package pins it to anything less.

So is this really an ecosystem-wide breakage? That feels a little unfair.

The more interesting test, in my view, is to use the latest patch or nonbreaking release that is compatible with the version available at the time of Julia 1.(x-1).


Isn’t that called type piracy, and packages shouldn’t do that in the ecosystem, no more than rely on internals in Base?

And the opposite is encapsulation, which I learned to be theoretically better… I’m conflicted here: being able to access the internals is powerful (and sometimes needed), so it shouldn’t be totally banned(?). Should it be banned by default in the (open source) package ecosystem that follows SemVer? It seems to break SemVer; or would it be better for packages that do this to bump x of x.y.z on every new release?

It seems that for the exceptions, something needs to be done. In C++, encapsulation is the rule (for classes), and private is the default; in Julia, public is the default. But in C++ there’s another option, a friend class. Should some packages be named as friends, and evolve together? [While I’ve learned of friends in C++, I’ve never used it myself, and never even heard of it being used… but maybe I didn’t do too much C++… and it’s common?]

This is an avoidable situation if packages correctly specify their compat bounds. Version 1.0 of SortingAlgorithms should have had a compat similar to this:

[compat]
julia = "1.0.0 - 1.8.5"

DataFrames should do the same, since they also rely on undocumented Base internals.

Similarly, any package that relies on the internals of another package, like KernelDensity in the example above from @Elrod, should place a strict upper bound for the dependency at the most recent version (down to the patch number) that is known to work.

The tacit acceptance of this sort of compat fuzziness that I am seeing from core devs is discouraging. It gives me no pleasure to say this, but compatibility issues in the Julia ecosystem due to incorrect compat bounds are contributing to the sense that “Julia has correctness issues”.

There was a GitHub issue or PR somewhere about adding documentation for a similar compat issue, but I can’t find it right now. The issue is that any package that uses

using SomePackage

instead of

using SomePackage: foo

actually needs to place a strict upper bound on SomePackage, because SomePackage could add a new function in a minor release that creates a namespace collision. So, this is another case where loads and loads of Julia packages have incorrect compat bounds for their dependencies.
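
A minimal sketch of that collision, using local modules as stand-ins for packages. `PkgA`, `PkgB`, `BlanketClient`, and `ExplicitClient` are hypothetical names, and `PkgB` is written as it might look after a minor release that starts exporting `foo`. The sketch assumes the usual Julia 1.x behavior where a name exported by two `using`-ed modules must be qualified.

```julia
# All module names below are hypothetical stand-ins for real packages.
module PkgA
export foo
foo() = "PkgA.foo"
end

# PkgB after a hypothetical minor release that added and exported `foo`.
module PkgB
export bar, foo
bar() = "PkgB.bar"
foo() = "PkgB.foo"
end

# Blanket imports: `foo` is exported by both modules, so the bare name
# is ambiguous inside this module.
module BlanketClient
using ..PkgA, ..PkgB
call_foo() = foo()
end

# Explicit imports: only the named bindings are brought in, so a new
# export from PkgB cannot collide with anything.
module ExplicitClient
using ..PkgA: foo
using ..PkgB: bar
call_foo() = foo()
end

# Record whether the blanket-import client fails to resolve `foo`.
blanket_failed = try
    BlanketClient.call_foo()
    false
catch err
    err isa UndefVarError
end

println("blanket import failed: ", blanket_failed)
println("explicit import result: ", ExplicitClient.call_foo())
```

Nothing in `BlanketClient` changed between the two hypothetical releases of `PkgB`, which is why a caret bound on the dependency does not protect a package that relies on blanket imports.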

These various compat issues would probably be made a little easier if Julia had some way of marking public vs private API. I’m not sure exactly how it would work, and I’m not saying that accessing private API should be completely disallowed, but, if nothing else, it would help package authors to be more cognizant of when they are relying on internals of Julia or of other packages.


Is that really helpful? Practically the effect is the same: SortingAlgorithms v1.0 cannot load on Julia v1.9. Surely it’s worse to have a situation where you’re guaranteeing that each and every minor release of Julia is incompatible (by rule) with all prior versions of DataFrames.

This is why we run PkgEval.


The issue with XGBoost.jl and LIBSVM.jl breaking within the Julia 1.8 series is very unfortunate.


Imagine I have a Project.toml file with the following compat section, and I’m running Julia v1.9:

[compat]
julia = "1.9"
SortingAlgorithms = "~1.0"

The package manager will happily install this environment, because SortingAlgorithms v1.0 allows any Julia version less than 2.0. But the project will be broken. I’ll say it again: the package manager will have installed a feasible set of package versions, and yet the project will be broken. That’s a correctness issue. This wouldn’t happen if SortingAlgorithms v1.0 upper bounded Julia at 1.8.5.

Besides, is it really so controversial that we actually follow semantic versioning within the Julia ecosystem?

I’ve tried to keep things simple by recommending a version bound like julia = "1.0.0 - 1.8.5", but even that is not necessarily correct for a package that relies on Julia internals. To list that range in its compat section, a package that uses Julia internals would need to be tested on every single minor and patch version of Julia within that range. That might sound pedantic, but that is in fact what semantic versioning requires. This should serve as a caution to package developers to avoid using internals of other packages (or Julia) unless absolutely necessary.


The issue is that any package that uses

using SomePackage

instead of

using SomePackage: foo

actually needs to place a strict upper bound on SomePackage, because SomePackage could add a new function in a minor release that creates a namespace collision.

Wow you have a fat point here! Noticing this for updating my packages…

From what I understand, in general Julia tries to be quite liberal with things in order to be able to explore possibilities.
That feels like quite the opposite approach compared to C++, and even more so Rust.

Regarding the public/private distinction, AFAIK the convention is that everything exported is public and everything else is assumed private, though you can import both exported and “private” symbols via using SomePackage: foo.
IMHO it should be more visible when this privacy barrier is broken. Also, sometimes it makes sense to force qualified use of symbols by not exporting them, while still assuming them as part of the public API.
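
A small sketch of that convention, with a hypothetical local module standing in for a package (`SomePackage` here is defined inline, and `_helper` is an invented internal function). It shows that an explicit `using ...: name` brings in an unexported name just as readily as an exported one, and that `Base.isexported` is one way to tell them apart.

```julia
# Hypothetical module standing in for a package.
module SomePackage
export foo
foo() = "exported, public by convention"
_helper() = "not exported, assumed private"
end

# Both imports succeed; nothing flags that `_helper` crosses the barrier.
using .SomePackage: foo, _helper

println(foo())
println(_helper())

# The export status can be checked explicitly.
println("foo exported: ", Base.isexported(SomePackage, :foo))
println("_helper exported: ", Base.isexported(SomePackage, :_helper))
```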

So yeah it seems there is space for improvement, and I am not sure what would be possible before 2.0…


I found the GitHub docs PR that recommends using SomePackage: foo over using SomePackage. The PR is still open.


This is not a great idea for a few reasons:

  • It means we cannot even attempt to run PkgEval on the package for new Julia versions so if the package is somewhat popular we are in the dark about what effect a new Julia release will have on the ecosystem.
  • It causes a lot of overhead for package authors. They cannot even test the package on newer Julia versions themselves: they either need to keep a separate branch where the compat is restricted, or restrict it on the master branch and then do backports to make releases available for the current Julia version.
  • Generally, if a change to internals in Julia causes a huge disruption in the ecosystem, we try to rejigger things so that the old stuff still works. Here are some examples: https://github.com/JuliaLang/julia/blob/3b993a958900d7f3b1aa064f2bf9e917edae0f79/base/deprecated.jl#L271-L294