Thoughts on eventual Julia 2.0 transition

That’s what @test_throws is for…

Thanks, good to know that.

julia> 'a' in "Julia"
true

A string is a collection of characters, so if you want to test if the character 'a' is part of the string then test the character, not the string "a".

OK. :slight_smile: Sorry, I thought it was clear that nalimilan and I were talking about substrings.

I probably should have written an example like this:

julia> "lie" in "Julie"
ERROR: use occursin(x, y) for string containment

julia> occursin("lie", "Julie")
true
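
To make the distinction concrete, here is a minimal sketch for Julia 1.x that checks all three cases with the Test standard library. The @test_throws line assumes that in on two strings still raises an ErrorException, as in the output above.

```julia
using Test

# Character membership: a String iterates over its characters.
@test 'a' in "Julia"

# Substring containment goes through occursin(needle, haystack).
@test occursin("lie", "Julie")
@test !occursin("ab", "bc")

# `in` with two strings is deliberately an error in Julia 1.x.
@test_throws ErrorException in("lie", "Julie")
```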

The work and time spent on compat.jl to keep packages working after a language upgrade would be better used for other things. It is a short-term gain and a long-term problem. A lazy package maintainer has no incentive to update their package because compat.jl has already done the work. But later on, the package update will be much harder, and the chances increase that the package will not keep up and will be abandoned. The natural selection (or deselection) of packages is postponed, and the problems grow larger with time.

Better to abandon a package and its usage earlier, because there is no compat.jl, than postpone the same decision for later, because for now I can rely on compat.jl.

I know this is an extreme position. The same with the depth of dependency and the 100% coverage. The real world solution has to be more relaxed, I know that.

But currently there are almost no rules:

  • a runtests.jl must exist and it must pass. That’s all. It could just contain @test 1 == 1
  • code coverage is not checked at all
  • dependency depth can be excessive, even circular (don’t know what happens, probably an error)
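
To illustrate the first point, here is a small sketch of the difference between a test file that merely passes and one that exercises the package. The function double is a made-up stand-in for real package code.

```julia
using Test

# Hypothetical package function, used only for illustration.
double(x) = 2x

# Passes, but says nothing about the package.
@testset "trivial" begin
    @test 1 == 1
end

# Actually exercises the code, including an invalid input.
@testset "double" begin
    @test double(2) == 4
    @test double(-1.5) == -3.0
    @test_throws MethodError double("a")
end
```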

This is definitely not enough to produce a high-quality and (long-term) reliable package ecosystem. But high-quality, reliable packages are crucial for the long-term success of Julia. Performance is a very good start, but in the long term it is not enough. (Of course, this depends on what you expect success to look like in the future.)

My understanding is that Compat.jl usually backports new syntax, so you update your code and it still works on the earlier version. This design allows a very smooth transition (just remove using Compat etc.) when you want to drop support for earlier versions.

You have never used compat, right?

Let’s say my code allocates a vector. In julia 0.6 I wrote v = Vector{Int}(N). In Julia 1.0 one writes v = Vector{Int}(undef, N). There is no overlap: It is frankly impossible to write code that is compatible with both 0.6 and 1.0. Like python print x vs print(x).

That sucks, because I am incentivized to wait with the switch until everybody I care about has also made the switch to 1.0. You see the problem?

Enter compat.jl. Now I write v = Vector{Int}(undef, N) and, with using Compat, this code will run fine under 0.6 as well as 1.0. The using Compat is essentially a no-op on 1.0. You see the advantage? I am incentivized to make the switch to 1.0 as early as possible, even if my codebase will mostly be run under 0.6 for the moment.
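
As a rough illustration of what you would otherwise have to write by hand (this is not how Compat.jl is actually implemented), a package could hide the difference behind a small helper that checks which form the running Julia supports. The name alloc_vector is made up for this sketch.

```julia
# Hypothetical helper (not Compat.jl's actual code): picks the constructor
# form supported by the running Julia version via feature detection.
@static if isdefined(Base, :undef)
    # Julia 0.7 / 1.0 and later: explicit `undef` initializer
    alloc_vector(::Type{T}, n::Integer) where {T} = Vector{T}(undef, n)
else
    # Julia 0.6 and earlier: length-only constructor
    alloc_vector(::Type{T}, n::Integer) where {T} = Vector{T}(n)
end

v = alloc_vector(Int, 5)   # uninitialized contents, length 5
```

With compat the helper is unnecessary, because the 1.0 spelling itself is made to work on 0.6.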

I would like to see a split of compat, though: compat1.0 and compat1.1, for backporting 1.0 features and backporting 1.1 features. Both compat versions have essentially nothing to do with each other and should not share the same name.

The main reason I did not upgrade from Python 2 to 3 for quite a long time is that I thought Python 2 was good and convenient enough, and some packages did not work in version 3.

I have used Python for more than ten years. I only upgraded to Python 3 last month, because most of my colleagues are using Python 3, some of whom learned Python starting with version 3. Last week, I updated a big project containing hundreds of source files with the help of the 2to3 tool, and finished within several hours. Everything works quite well.

I like Julia because of its simplicity, elegance, and high performance. To me, Python has only the first one or two of these. I like the Zen of Python language design principles (PEP 20). It would be nice if Julia had a similar official document.

The Zen of Python

  • Beautiful is better than ugly.
  • Explicit is better than implicit.
  • Simple is better than complex.
  • Complex is better than complicated.
  • Flat is better than nested.
  • Sparse is better than dense.
  • Readability counts.
  • Special cases aren’t special enough to break the rules.
  • Although practicality beats purity.
  • Errors should never pass silently.
  • Unless explicitly silenced.
  • In the face of ambiguity, refuse the temptation to guess.
  • There should be one-- and preferably only one --obvious way to do it.
  • Although that way may not be obvious at first unless you’re Dutch.
  • Now is better than never.
  • Although never is often better than right now.
  • If the implementation is hard to explain, it’s a bad idea.
  • If the implementation is easy to explain, it may be a good idea.
  • Namespaces are one honking great idea – let’s do more of those!

No, I have not used compat.jl yet, but this is not because I don’t like the idea. It was just not needed.

What you describe is, for me, exactly the road into dependency hell.

These are good arguments pro compat.jl. You could nearly use the same arguments to put deprecated functions into compat.jl or whatever is needed for a major language upgrade. When this is done, why should we ever drop compat.jl? It is there, it is good, it makes everything smooth. Great.
So why is it not done like that? Because it will be awful to maintain in the long term. The benefit-to-cost ratio is only good in the short term.

And if there is no long-term disadvantage, I am all for the short-term advantage. If this is the case for compat.jl, then why not?

My suggestions are probably too extreme; I knew that before I made them. But agreeing on that does not mean that the Julia people should ignore the topic. It is about future transitions, and that is not only about Julia itself; it is mainly about the package system. The packages will be the main reason to use Julia, not the performance (except for niches). In my opinion, Julia has the potential to be the first-choice language in many fields, in nearly all fields, again except for special niches. If this is not the goal, then we have no need to discuss this. For niches, Julia will always be perfect because of its performance, and in that case the packages play a minor role.

So you may understand my suggestions better when you know that I am talking about Julia replacing R, C, C++, Python, PHP, Java, JavaScript, C# and many others. (I know this won’t happen, but the goal you set defines what you will reach.)

Could you explain this more precisely please?

This is basically the plan for JuliaPro going forward.

A package X.jl moves from 0.6 to 1.0, using compat.jl for those who do not move to 1.0 but still use the package with Julia 0.6. A user of this package updates his packages (Pkg.update()) and gets the new versions, probably with new functions over time, which he happily uses. The user may not be able to move to 1.0 because one of the other important packages he uses has been abandoned, so he stays on 0.6. Time moves on, and X.jl drops compat.jl support for 0.6. From this point on, new developments in X.jl are no longer available to the user, but over that long time he has become highly dependent on X.jl, and still on the other, abandoned package. So he is now stuck. He cannot move to the current Julia version to get new features (or bug fixes) of X.jl, and porting all the code he now has would be more work than rewriting it from scratch. Looking back, it would have been much less work if he had been forced to move to 1.0 together with X.jl at the time.

Now this is not yet dependency hell. But it starts like that, and typically not just one or two packages are involved but ten or more, each with dependencies of its own, and so on.

I want to chime in here because I strive for this but there are cases where you simply cannot get 100% coverage. An example I recently ran into: consider code where you open a file (assume successfully), and then you do something with the IO object in a try/catch block. Here’s the concrete example in my case:

function savegraph(fn::AbstractString, g::AbstractGraph, gname::AbstractString,
     format::LGCompressedFormat)
     io = open(fn, "w")
     try
         io = GzipCompressorStream(io)
         return savegraph(io, g, gname, LightGraphs.LGFormat())
     catch
         rethrow()
     finally
         close(io)
     end
...

How do you reliably test that try/catch block? You could, I suppose, induce a race condition whereby the disk gets filled up between the time the file is opened for writing and the time savegraph is called, but that seems to be a crazy solution to get a couple of lines of code coverage that tests a system level failure.
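
One way to reach a branch like this without filling the disk might be to separate the "always close" logic from the file system and inject a failure. The sketch below assumes such a refactoring is acceptable; write_and_close is a hypothetical helper, not part of LightGraphs, and an IOBuffer stands in for the file.

```julia
using Test

# Hypothetical helper: runs `f(io)` and closes `io` whether or not `f` throws.
function write_and_close(f, io::IO)
    try
        return f(io)
    finally
        close(io)
    end
end

@testset "io is closed when the writer throws" begin
    io = IOBuffer()
    # Simulate a write failure instead of provoking a real system-level error.
    @test_throws ErrorException write_and_close(_ -> error("simulated write failure"), io)
    @test !isopen(io)
end

@testset "result is returned on success" begin
    io = IOBuffer()
    @test write_and_close(s -> write(s, "abc"), io) == 3
    @test !isopen(io)
end
```

This covers the cleanup path, but it does not test a real disk failure, so the original point about system-level errors still stands.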

Yes, this is another good example of an exception to the 100% rule, like every try…catch; that’s why it is called exception handling :grinning:

The idea behind good test coverage is that I can easily, and even automatically, check whether a package runs in general on a given version of Julia. Achieving this requires good coverage, but not 100%. In your example, the exception really is an exception: if the package fails in that case because rethrow() is not part of the Julia version, it doesn’t matter, because it is not the normal flow.

What I want to do: I am reasonably happy with Julia 0.7 and I am not able to move to 1.0, because some absolutely essential packages I need are still only available for 0.7 or below. Another great package developer starts developing his package GreatStuff.jl on 1.0 and adds julia 1.0- to his REQUIRE file. Now I want to install GreatStuff.jl on Julia 0.7, and clearly Pkg.test("GreatStuff") will work out fine, so I can use it with proper caution and it will not break my system. I am happy and can solve my daily tasks.

In this case we knew this beforehand, things with julia are still easy and not so complex, systems are small, more or less experiments, not yet production.

Now imagine large production systems with deeply nested package dependencies, both in-house and external, highly meshed, with millions of lines of Julia code. I want to see that these systems can still be moved forward to new major Julia versions with reasonable effort. This must be the goal for a general-purpose language.

Hmmm, putting in 1.0 as minimum requirement is a bit too ambitious in my opinion. I’d contact the developer and ask him to change it to 0.7. At least I see no reason why I’d force someone to +1.0 instead of +0.7…

Because he just started his Julia career with 1.0 and didn’t know about 0.7 as a transition version from 0.6.

The whole point is not about a single package. It’s about a huge package ecosystem with dependencies, and about users who rely on that ecosystem for years and run large production systems.

An interesting blog post about the eventual transition of Go to version 2.0:

https://blog.golang.org/toward-go2

In particular,

Go 2 must bring along all those developers. We must ask them to unlearn old habits and learn new ones only when the reward is great. For example, before Go 1, the method implemented by error types was named String. In Go 1, we renamed it Error, to distinguish error types from other types that can format themselves. […] That kind of clarifying renaming was an important change to make in Go 1 but would be too disruptive for Go 2 without a very good reason.

It is just a position (maybe I did not describe it clearly with my poor English): “Don’t count your chickens before they hatch.”

Especially if there is still a possibility that some debacle could happen.

I found another place where this is not true! There is

broadcastable(x::Union{Symbol,AbstractString,Function,UndefInitializer,Nothing,RoundingMode,Missing,Val}) = Ref(x)
...
broadcastable(x) = collect(x)

inside broadcast.jl in Base.

So we get (unexpectedly, if Strings are understood as collections):

julia> mask = occursin.("ab", ["bc", ])
1-element BitArray{1}:
 false

If we want a String to be treated as a collection, we need to do:

julia> mask = occursin.(collect("ab"), ["bc"])
2-element BitArray{1}:
 false
  true

But we don’t usually look at a Regex as a collection, so we could probably consider:

julia> Base.broadcastable(x::Regex) = Ref(x)

julia> mask = occursin.(r"ab", ["ba", "ab"])  
2-element BitArray{1}:
 false
  true

julia> mask = occursin.([r"ab"], ["ba", "ab"]) # this trick is not necessary after redefining Base.broadcastable for Regex. 

We probably want to add Regex to the “self-reference-broadcastable” types.

Is this covered by the non-breaking compatibility guarantee for 1.x, or could we change it in Julia 1.3 (or 1.4?) after a deprecation period? Or could we add it immediately, because collect(x::Regex) is an error in the current implementation?

As this is a big example where Strings are not treated as collections, does it help to convince you that it could be good to redefine this?

in(::AbstractString, ::AbstractString) = error("use occursin(x, y) for string containment") 

Yes, strings behave as scalars in the context of broadcasting because it’s generally much more useful. I agree that’s inconsistent, but I guess convenience trumps systematicity in some situations. Anyway, that’s not an argument to change the behavior of in, as it wouldn’t be consistent with either of the existing behaviors.
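
For reference, a small sketch of the two behaviors side by side on Julia 1.x. It uses only Base functions; wrapping the Regex in Ref is the generic way to mark a value as scalar for broadcasting, and it works whether or not Regex gets its own broadcastable method.

```julia
# A String is wrapped in a Ref for broadcasting, so it acts as a single value.
@assert Base.broadcastable("ab") isa Ref

# One needle, many haystacks: the String is not split into characters.
mask = occursin.("ab", ["bc", "abc"])
@assert mask == [false, true]

# Iteration, by contrast, still treats the String as a collection of Chars.
@assert collect("ab") == ['a', 'b']

# Wrapping in Ref makes any value scalar-like for broadcasting, e.g. a Regex.
@assert occursin.(Ref(r"ab"), ["ba", "ab"]) == [false, true]
```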