Thoughts on eventual Julia 2.0 transition

These discussions are very important for the future success of Julia.

This is what I would like to see for the future of Julia:

Prerequisites (or maybe facts):

  1. language upgrades can always be breaking
  2. maintaining compatibility is costly and hinders progress
  3. abandoned community packages make upgrades of production systems impossible
  4. complex dependencies between community packages slow down language upgrades heavily

How Julia should conduct language upgrades:

  • a minor upgrade should never break anything; major upgrades surely will

The following applies to major upgrades only:

  • compatibility has no or very low priority (expect breaking changes for sure)
  • no time is wasted on compat.jl (no priority for compatibility); drop compat.jl in favor of:
  • deprecation warnings for every breaking change, together with version information, change advice and a reference to the decision discussion, maintained in standard packages (e.g. package deprecations_060_070.jl)
    when upgrading from 0.6 to 0.7 these are just warnings
    after upgrading to 1.0 (or any higher version) the same information is shown, but clearly as a breaking error
    those packages remain for all future as far as possible, e.g. upgrading from 0.6.2 to 3.0.0 still produces helpful error messages
  • after a major release, only the last major version is maintained (1.0.0, 0.6 maintenance, 0.5 frozen)
  • a new major release happens only if the last maintenance release is reasonably stable (no need to hurry)
  • only one road of development, no branches, no LTS versions
  • each installation of a major Julia version should be completely independent and self-contained (as it is now).
    This should also be the case for distributions (Debian, CentOS, …).
    It should always be possible, and no problem at all, to install many Julia versions on a single system without ever having trouble with that. This is currently the case.
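To make the deprecation-package idea a bit more concrete, here is a minimal sketch of what one entry in something like deprecations_060_070.jl could look like. Everything in it (the `Deprecation` type, its fields, the `check` function, the placeholder discussion link) is hypothetical and does not exist in Julia; the version numbers simply follow the warn-in-0.7, error-from-1.0 scheme described above.

```julia
# Hypothetical sketch: none of these names exist in Julia or its standard library.
struct Deprecation
    old::String                 # the removed or changed form
    advice::String              # what to write instead
    warn_from::VersionNumber    # first version that only warns
    error_from::VersionNumber   # first version where the old form is an error
    discussion::String          # pointer to the decision discussion (placeholder)
end

# Example entry, using the Vector constructor change mentioned later in this thread.
const VECTOR_UNDEF = Deprecation(
    "Vector{Int}(N)",
    "Vector{Int}(undef, N)",
    v"0.7.0",
    v"1.0.0",
    "<link to decision discussion>",
)

# Return (:ok | :warning | :error, message) for the Julia version that is running.
function check(dep::Deprecation, running::VersionNumber)
    msg = "`$(dep.old)` changed in $(dep.warn_from): use `$(dep.advice)`. See $(dep.discussion)"
    running >= dep.error_from && return (:error, msg)
    running >= dep.warn_from && return (:warning, msg)
    return (:ok, "")
end

check(VECTOR_UNDEF, v"0.6.2")   # old form still valid, no message
check(VECTOR_UNDEF, v"0.7.0")   # warning with change advice
check(VECTOR_UNDEF, v"3.0.0")   # error, but still with the same helpful advice
```

The point of keeping such entries around indefinitely would be that even a jump from 0.6.2 to 3.0.0 yields the advice text instead of a bare failure.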

Packages and dependency hell:

  • Make hierarchy of package dependencies flat:

Packages with no dependency:

  • consider those as candidates for absorption into standard library if widely in use and of good generality

Packages with dependencies (using,import):

  • allow only one level of depth => an imported package can not import other packages, must be a package without dependency
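As a rough illustration of the one-level rule, the following sketch computes the depth of a dependency chain and rejects anything deeper than one. The `DEPS` table and the functions `dep_depth` and `allowed` are made up for this example; this is not how Pkg works.

```julia
# Hypothetical dependency table: package name => direct dependencies.
const DEPS = Dict(
    "A" => String[],    # no dependencies
    "B" => ["A"],       # one level: allowed under the proposed rule
    "C" => ["B"],       # two levels: not allowed
)

# Length of the longest dependency chain below `pkg`; throws on a cycle.
function dep_depth(deps, pkg, seen = String[])
    pkg in seen && error("circular dependency involving $pkg")
    children = get(deps, pkg, String[])
    isempty(children) && return 0
    return 1 + maximum(dep_depth(deps, c, vcat(seen, pkg)) for c in children)
end

# The proposed rule: a package may import only dependency-free packages.
allowed(deps, pkg) = dep_depth(deps, pkg) <= 1
```

With this table, `allowed(DEPS, "B")` holds and `allowed(DEPS, "C")` does not.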

All official Packages:

  • test coverage must be 100%
    this allows automatic version compatibility tests for lower and upper Julia version bounds. I am thinking here of lazy developers who just declare their current development version as the lower bound, so that you cannot install the package even though it would run without problems on lower Julia versions (thinking of R packages here).

While personally I strive for this, some developers consider it excessive. Also, some widely used “official” packages have much less.

It's not always possible to have 100% test coverage, e.g. if there are special hardware dependencies (GPU) which cannot be set up on the automatic test systems. In these cases an exception to the rule may be an option. But in general 100% coverage should be enforced. It would be no problem to start enforcing it with the next major release.

Another important feature which I forgot (I will edit it into the post above):

  • each installation of a major julia version should be completely independent and self-contained (as it is now).
    This should be also the case for distributions (debian,centOS,…).
    It should always be possible and no problem at all to install many julia versions on a single system without ever having trouble with that. This is currently the case.

Of course you want “high” coverage, but you should also be reasonable. 100% test coverage isn’t even possible in many cases. Trying to get 100% test coverage goes against defensive programming. Take for example a long running stochastic algorithm (stochastic differential equation solver?). You can reason mathematically that 1/100,000 times you may need an extra branch to be statistically correct. You should add that branch even if it will never be possible for CI to generate enough cases to hit that (too time consuming) because it’s more correct. You should add things to handle cases that you may not know how to directly trigger quite yet, not make things less correct in order to achieve an arbitrary 100% coverage number.

Maybe I am wrong, but my understanding of 100% code coverage is that each line of code in a package is hit at least once during the tests. It is not about 100% correct code. So I believe that 100% code coverage in this sense can always be achieved. And if it is not possible with a real-world test case, it should always be possible with an artificial call to the subroutine.
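A small sketch of what I mean by an artificial call: the function `guarded_step` below is hypothetical, with a defensive branch that a realistic run would almost never reach, and the second test reaches it by passing an input chosen only for that purpose.

```julia
using Test

# Hypothetical solver step with a defensive branch that real runs rarely hit.
function guarded_step(x::Float64, dx::Float64)
    if !isfinite(x + dx)      # rare case: the update overflowed or produced NaN
        return x              # keep the previous state instead of propagating it
    end
    return x + dx
end

@testset "guarded_step" begin
    @test guarded_step(1.0, 0.5) == 1.5   # ordinary path
    @test guarded_step(1.0, Inf) == 1.0   # artificial call that hits the rare branch
end
```

This covers every line, although it says nothing about whether the rare branch is statistically correct in a real run.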

There are still exceptions like hardware dependent code not available on the test system.

In general, test routines for non-trivial functions always (or with high probability) cover only special cases and will not guarantee 100% correct code. So nothing is lost if 100% code coverage is expected, even if it is obtained only with artificial test calls.

But the big advantage is that it is easy to check whether the package runs in general on the current version of Julia, just by calling Pkg.test("name_of_package").

The reason I proposed such a thing is that I have suffered for years (even decades now) from the mess the package system is in for R and Bioconductor, where packages are nearly always some kind of PhD work and are more or less unmaintained afterwards.

Being highly restrictive on community driven software ecosystems is not wrong in my opinion.

100% coverage would mean that the error catching code would be exercised as well (example) … and that would make the tests fail.

Packages with dependencies (using,import):

  • allow only one level of depth => an imported package can not import other packages, must be a package without dependency

No way will this ever work. Sure, left_pad is to be avoided, but deeper dependencies are essential.

All official Packages:

  • test coverage must be 100%

https://github.com/auchenberg/volkswagen

no time is wasted for compat.jl

And how do you expect packages that are not significantly affected by some breaking change to stay compatible with multiple versions of the language? At some point separate git branches are needed, but a lot of code can run nicely on either 0.6 or 1.0 when using compat.

The only thing I’d really like to see is some officially blessed curated selection of packages. Just like e.g. debian has an officially blessed curated selection of open source packages that is unlikely to contain non-working packages or malware (except for broken key generation). This would be a small subset of the current package ecosystem, and also help discoverability, and significantly centralize the transitions.

People would still be free to use other packages, maybe similar to the archlinux AUR system. Something like pkg> add contrib/foo_pkg vs pkg> add official/bar_pkg, with the only restriction that official packages can only depend on other official packages.
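The restriction could be checked mechanically. A minimal sketch, assuming a made-up `registry` table with a tier per package (neither the table nor `respects_tiers` corresponds to anything in Pkg):

```julia
# Hypothetical registry: package name => (tier, direct dependencies).
registry = Dict(
    "base_pkg" => (tier = :official, deps = String[]),
    "bar_pkg"  => (tier = :official, deps = ["base_pkg"]),
    "foo_pkg"  => (tier = :contrib,  deps = ["bar_pkg"]),
    "bad_pkg"  => (tier = :official, deps = ["foo_pkg"]),   # violates the rule
)

# Official packages may depend only on official packages; contrib is unrestricted.
function respects_tiers(registry, name)
    entry = registry[name]
    entry.tier == :contrib && return true
    return all(registry[d].tier == :official for d in entry.deps)
end
```

Here `bar_pkg` and `foo_pkg` pass, while `bad_pkg` would have to be rejected from the official set or moved to contrib.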

That’s what @test_throws is for…
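For example (the function `checked_sqrt` is just a hypothetical stand-in for any code with an error path):

```julia
using Test

# Hypothetical function whose error-handling line we want covered by the tests.
function checked_sqrt(x::Real)
    x < 0 && throw(DomainError(x, "expected a non-negative number"))
    return sqrt(x)
end

@test checked_sqrt(4.0) == 2.0
@test_throws DomainError checked_sqrt(-1.0)   # passes because the error is thrown
```

So the error branch is exercised and the test suite still passes.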

Thanks, good to know that.

julia> 'a' in "Julia"
true

A string is a collection of characters, so if you want to test if the character 'a' is part of the string then test the character, not the string "a".

OK. :) Sorry, I thought it was clear that nalimilan and I were talking about substrings.

I probably should have written an example like this:

julia> "lie" in "Julie"
ERROR: use occursin(x, y) for string containment

julia> occursin("lie", "Julie")
true

The work and time spent on compat.jl to make packages work after a language upgrade would be better used for other things. It is a short-term gain and a long-term problem. The lazy package maintainer has no incentive to update his/her package because compat.jl has already done the work. But later on the package update will be much harder, and chances increase that the package will not follow and will be abandoned. The natural selection or deselection of packages is postponed, and the problems become larger with time.

Better to abandon a package and its usage earlier, because there is no compat.jl, than postpone the same decision for later, because for now I can rely on compat.jl.

I know this is an extreme position. The same with the depth of dependency and the 100% coverage. The real world solution has to be more relaxed, I know that.

But currently there is no rule:

  • a runtests.jl must exist and it must pass. That's all. It could just contain @test 1 == 1
  • code coverage is not checked at all
  • dependency depth can be excessive, even circular (don’t know what happens, probably an error)

This is definitely not enough to produce a high-quality and (long-term) reliable package ecosystem. But high-quality and reliable packages are crucial for the long-term success of Julia. Performance is a very good start, but in the long term it is not enough. (Of course this depends on what you consider a success in the future.)

My understanding is that Compat.jl usually backports new syntax, so you update but it will work in the earlier version. This design allows a very smooth transition (just remove using Compat etc) when you want to drop support for earlier versions.

You have never used compat, right?

Let’s say my code allocates a vector. In julia 0.6 I wrote v = Vector{Int}(N). In Julia 1.0 one writes v = Vector{Int}(undef, N). There is no overlap: It is frankly impossible to write code that is compatible with both 0.6 and 1.0. Like python print x vs print(x).

That sucks, because I am incentivized to wait with the switch until everybody I care about has also made the switch to 1.0. You see the problem?

Enter compat.jl. Now I write v = Vector{Int}(undef, N) and, with using Compat, this code will run fine under 0.6 as well as 1.0. The using Compat is essentially a no-op on 1.0. You see the advantage? I am incentivized to make the switch to 1.0 as early as possible, even if my codebase will mostly be run under 0.6 for the moment.
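Spelled out as a snippet, the new form looks like this. It runs as written on Julia 1.0 and later; on 0.6 the same lines would additionally need `using Compat` at the top, which I left out here so the snippet has no package dependency.

```julia
# On Julia 0.6, `using Compat` would be required before this line;
# on Julia 1.0 and later the `undef` constructor is built in.
N = 5
v = Vector{Int}(undef, N)   # allocated but uninitialized, contents are arbitrary
fill!(v, 0)                 # give the elements defined values before reading them
```

The old form `Vector{Int}(N)` is the one that no longer works on 1.0.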

I would like to see a split of compat, though: compat1.0 and compat1.1, for backporting 1.0 features and backporting 1.1 features. Both compat versions have essentially nothing to do with each other and should not share the same name.

The main reason I did not upgrade from Python 2 to 3 for quite a long time is that I thought Python 2 was good and convenient enough, and some packages did not work in version 3.

I have used Python for more than ten years. I only upgraded to Python 3 last month, because most of my colleagues are using Python 3, and some of them learned Python starting with version 3. Last week, I updated a big project containing hundreds of source files with the help of the 2to3 tool, and finished within several hours. Everything works quite well.

I like Julia because of its simplicity, elegance, and high performance. For me, Python has just the first one or two of these. I like the Zen of Python language design, PEP 20. It would be nice if Julia had a similar official one.

The Zen of Python

  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Complex is better than complicated.
  • Flat is better than nested.
  • Sparse is better than dense.
  • Readability counts.
  • Special cases aren’t special enough to break the rules.
  • Although practicality beats purity.
  • Errors should never pass silently.
  • Unless explicitly silenced.
  • In the face of ambiguity, refuse the temptation to guess.
  • There should be one-- and preferably only one --obvious way to do it.
  • Although that way may not be obvious at first unless you’re Dutch.
  • Now is better than never.
  • Although never is often better than right now.
  • If the implementation is hard to explain, it’s a bad idea.
  • If the implementation is easy to explain, it may be a good idea.
  • Namespaces are one honking great idea – let’s do more of those!

No, I have not used compat.jl yet, but this is not because I don’t like the idea. It was just not needed.

What you describe is for me exactly the road into the dependency hell.

These are good arguments pro compat.jl. You could nearly use the same arguments to put deprecated functions into compat.jl or whatever is needed for a major language upgrade. When this is done, why should we ever drop compat.jl? It is there, it is good, it makes everything smooth. Great.
So why is it not done like that? Because it would be awful to maintain in the long term. The benefit-to-cost ratio is only good in the short term.

And if there is no long-term disadvantage, I am all for the short-term advantage. If this is the case for compat.jl, then why not?

My suggestions are probably too extreme; I knew that before I made them. But agreeing on that does not mean that the Julia people should ignore the topic. It is about future transitions, and that is not only about Julia itself, it is mainly about the package system. The packages will be the main reason to use Julia, not the performance (except in niches). In my opinion Julia has the potential to be the first-choice language in many fields, in nearly all fields, again except for special niches. If this is not the goal, then we have no need to discuss this. For niches Julia will always be perfect because of its performance. In that case the packages play only a minor role.

So you may understand my suggestions better when you know, that I am talking about Julia to replace R, C, C++, Python, php, Java, JavaScript, C# and many others. (I know this won’t happen, but it is the goal which defines what you will reach).

Could you explain this more precisely please?

This is basically the plan for JuliaPro going forward.

A package X.jl moves from 0.6 to 1.0, using compat.jl for those who don't move to 1.0 but still use the package with Julia 0.6. A user of this package updates his packages (Pkg.update()) and gets the new versions, probably over time with new functions, which he happily uses. The user may not be able to move to 1.0 because one of the other important packages he uses has been abandoned, so he stays with 0.6. Time moves on, and X.jl drops compat.jl for 0.6. From this point on, new developments in X.jl are no longer available to the user, but over that long time he has become highly dependent on X.jl and still depends on the abandoned package. So he is now stuck. He can't move to the current Julia version to get new features (or bug fixes) of X.jl, and porting all the code he has by now would be more work than rewriting it. Looking back, it would have been much less work if he had been forced to move to 1.0 together with X.jl at the time.

Now this is not yet dependency hell. But it starts like that, and typically not just one or two packages are involved but 10 or more, each with dependencies of its own, and so on.