Discussion on "Why I no longer recommend Julia" by Yuri Vishnevsky

Building on what @JohnnyChen94 said, with specific focus on this point:

While perhaps the blog post does not provide a single, well-honed argument directed towards the language, there are some very clear takeaways. For one, it references legitimate shortcomings in “older and more mainstream” libraries in the Julia ecosystem. Sure, we do not expect new libraries to emerge fully-formed and ironclad, but bedrock infrastructural components with many years of development and (nominally) more eligible maintainers for continued development can be held to a higher standard.

Now, one challenge that has been brought up wrt issues around interop/composability is that we don’t know what we don’t know when it comes to users combining libraries in novel ways. While I empathize with that perspective, I would posit that in many cases we can either anticipate issues or have seen them before. In that light, things are not so hopeless and there are actionable things package maintainers can do:

  1. Make interop work correctly. This is currently constrained by not having glue packages, but in many cases one package will be a dep of another.
  2. If 1) is not possible, error or warn when a problematic combination of inputs is detected. An in-your-face message is difficult to miss!
  3. If runtime checks are not feasible, then @mkitti’s suggestion sounds great. 🔪 JAX - The Sharp Bits 🔪 — JAX documentation may provide some inspiration here. My only addition would be that the list should be accessible directly from the README and/or docs landing page.
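For step 2, Base already ships a helper for the most common case. A minimal sketch, where colsum is a hypothetical package function (not from any actual library):

```julia
# Hypothetical package function that assumes 1-based indexing.
# Base.require_one_based_indexing throws an ArgumentError for offset
# arrays instead of silently computing garbage.
function colsum(A::AbstractMatrix)
    Base.require_one_based_indexing(A)
    [sum(A[:, j]) for j in 1:size(A, 2)]
end

colsum([1 2; 3 4])  # [4, 6]
```

The point is that the check costs one line and turns a silent wrong answer into an immediate, searchable error.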

All of the above are an improvement over the status quo of users scouring issues and discourse posts, often without a good set of keywords. Heck, even as a package maintainer I often find myself wasting half an hour here or there finding related issue reports. All this work does rest on the assumption that core libraries in the ecosystem have enough dev capacity (looking at number of contributors, commit and release frequency) to tackle it and are not suffering from a case of XKCD #2347. If that is not true for particular packages despite outward appearances, then perhaps there is a more fundamental issue at hand.

3 Likes


Is Julia really appropriate for high-assurance real-time control systems at this point? I find this unlikely given the immature static analysis tools and the difficulty of managing allocation/collection pauses.

The “two-language problem” that people usually talk about is really a “two requirements problem”: fast programming and fast throughput, which is sometimes addressed by Python and C++. There are reasons for using C++ other than “high throughput” and, for that matter, reasons for using Python other than “fast programming”.

I think as a community we should be careful to avoid pattern matching on “Python and C++” as a generic source of potential Julia users. Julia is primarily targeted at one use case of those languages, and users with other requirements will be disappointed by Julia.

11 Likes

So it seems most users agree that the implementation of eachindex is correct. However, the docstring requires that for arrays only two return values are allowed: either 1:length(A) (which clearly must not be returned for OffsetVector and is not returned) or “specialized Cartesian range” (which it does not return). While it makes sense that IdOffsetRange is returned the point of my question is the following:

  • we write a generic function accepting AbstractArray x;
  • inside the function we run Base.IndexStyle(typeof(x)) and for OffsetVector we get IndexLinear()
  • given the eachindex docstring, we assume that eachindex will return 1:length(x) - this assumption is wrong though.

My conclusion - in accordance with what @Henrique_Becker has written - is that the crucial problem is the eachindex docstring. The issue is that, IMHO, we need to carefully review the contracts that Base Julia functions provide to make sure they are precise and easy to understand for developers, so that when they add methods to such functions they can implement them correctly. Unfortunately, this task is quite hard to achieve in practice.
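The mismatch is easy to reproduce even without OffsetArrays, using a minimal 0-based vector (ZeroVec is a made-up type for illustration; OffsetVector behaves analogously):

```julia
# A minimal 0-based vector, invented for illustration only.
struct ZeroVec{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::ZeroVec) = size(v.data)
Base.axes(v::ZeroVec) = (0:length(v.data) - 1,)
Base.getindex(v::ZeroVec, i::Int) = v.data[i + 1]
Base.IndexStyle(::Type{<:ZeroVec}) = IndexLinear()

v = ZeroVec([10, 20, 30])
Base.IndexStyle(typeof(v))  # IndexLinear()
eachindex(v)                # 0:2, not 1:length(v) == 1:3
```

So an IndexLinear trait does not imply that eachindex returns 1:length(x); it returns the (possibly offset) axis.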

24 Likes

Even though I’m very enthusiastic about Julia in general, this hits home. I’ve personally run into the problem of ChainRules returning incorrect results. Luckily, I’d come across the various discussions about bugs in Julia’s AD frameworks before, so as soon as I saw any kind of misbehavior in my program, checking the correctness of the derivatives was the first thing I did. I was able to identify the bug relatively easily and submit a bugfix, but it still cost 3-4 full workdays, so not an ideal situation. I’m somewhat prepared to always test the correctness of any third-party library for my particular use case, but it does add overhead, and it’s very much a sign of an immature ecosystem.

Contrary to the OP, I don’t really think that these growing pains cannot be overcome. It will probably just take time. After all, Julia is still quite young. But it is a problem, and in terms of the entities that can provide funding (NumFocus, JuliaComputing) it would be good to try to address these issues consciously and directly. That probably means finding ways to have full-time paid software engineers working on some of the core packages of the ecosystem, and to invest in tooling.

Also, the place where I personally feel Julia’s relative immaturity most is the lack of effective linting/code analysis/testing tools. There’s a lot happening, but it doesn’t yet come anywhere close to the level of tooling that e.g. Python has (although it’s probably a lot better than Python’s tooling at 10 years of age). Right now, it’s very hard to find misspelled variables, unused code paths, issues like the incompatibility with OffsetArrays that the article mentions, etc. VSCode claims to have some tooling, but either I’m not setting it up correctly, or it just doesn’t work very well (plus, I very strongly prefer vim, so independent command-line tools would be preferable).

36 Likes

Yeah, while I of course like Julia, the author has a point. I don’t think it has to do with the lack of interfaces; rather, the mantra that packages will “just work” with a whole ecosystem of independently developed packages is often repeated by the community, and it is simply not true.

Part of the reason Julia has been so successful is that such extreme composability is possible, but I think the main point brought up by this article is that the ecosystem tends to assume that disparate packages will be composable, and if they are not, then there will be some obvious error message.

In my experience, unless the maintainers of packages are both very experienced, and put in effort to ensure continued compatibility (as e.g. with SciML), composition regularly breaks. Fortunately the outcome of breakage is usually just an impenetrable error message, but as this blog points out, it can sometimes result in bugs that are very difficult to detect.

As other people have concluded, I agree that the primary issue is that testing practice in Julia comes from other languages that are far less flexible, and is not suited to the enormous range of possible behaviors that widely used generic code can encounter.

19 Likes

Jo Schg wrote a good comment on slack about how the Rust community systematically values correctness in ways that we don’t. I won’t quote the comment in case they want it private, but the main reference is this video: Scale By The Bay 2018: Bryan Cantrill, Rust and Other Interesting Things - YouTube

In the spirit of that comment, I’ve written a little package that defines an AbstractArray type that will throw KeyError if you index it without using eachindex, firstindex, or lastindex.

You can find it here: GitHub - cmcaine/EvilArrays.jl

I don’t know if it’ll be helpful, but I hope so!
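I haven’t read the package’s source, but the idea can be sketched in a few lines (this is my own illustration, not EvilArrays.jl’s actual implementation):

```julia
# An "evil" vector whose valid indices start at an arbitrary offset, so
# any code hard-coding 1:length(v) blows up instead of silently working.
struct EvilVector{T} <: AbstractVector{T}
    data::Vector{T}
    offset::Int
end
EvilVector(data::Vector) = EvilVector(data, 1000)
Base.size(v::EvilVector) = size(v.data)
Base.axes(v::EvilVector) = (v.offset + 1:v.offset + length(v.data),)
Base.getindex(v::EvilVector, i::Int) = v.data[i - v.offset]

v = EvilVector([1, 2, 3])
sum(v[i] for i in eachindex(v))  # 6: generic code works
# v[1] throws a BoundsError: 1-based assumptions fail loudly
```

Throwing such a type at a package’s test suite is a cheap way to flush out 1-based-indexing assumptions.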

46 Likes

Most of the technical issues raised by the author are a consequence of two simple facts:

  1. The subset of the Julia community that is capable of developing the relevant packages is too small. You often see the same faces maintaining 10 different packages across 5 different organizations. They cannot research, develop, document, maintain, and fix bugs in all of these efforts at the same time, alone.

  2. The lack of interfaces or traits as a language feature leads to poor documentation of behavior. After years of programming in Julia, I still struggle to understand what is needed to fully define a custom AbstractArray. Even a widely implemented interface such as Tables.jl is extremely hard to adhere to because there are no mechanisms to programmatically check that an implementation is doing what it is supposed to do. We often discover bugs at runtime when a 3rd-party package tries to use the table type in a non-trivial way.

To solve (1) we need some sort of mechanism to convert end-users into maintainers of packages.

To solve (2) I don’t know. I remember following various interesting discussions on Zulip about how a trait system in Julia could potentially worsen the compilation time issue.

67 Likes

The post presents some interesting points.

I’m very enthusiastic about Julia, and by far the most significant downside I see in the language/ecosystem is a general lack of documentation. I also do get the point of the author that the language offers great generality and that this comes at a price.

In fact, it’s very likely that most of the points raised by this author could be solved by keeping a list of “common gotchas!”, either in the base documentation or on each of the conflicting packages.

For example, I believe that OffsetArrays.jl should by no means be considered a standard part of the Julia ecosystem. Anybody employing this package should be aware that it might not compose very well with any package employing the ubiquitous 1:length(A) syntax.

In fact, OffsetArrays implements a feature of Fortran that has been in the language forever, and yet nobody uses it, precisely because it’s so error-prone.

I find it a little weird that 1:length(A), even though compatible with the default array behavior, should be considered harmful, but perhaps that’s the case. Newcomers to Julia, and especially package authors, should perhaps be told that using eachindex(A) is the best practice, in view of AbstractArrays not necessarily being 1-based, and in particular because of this silent conflict with @inbounds.

Passing the same object as two different arguments for an in-place function also looks like a red flag to me. It would be nice if this behavior resulted in an error.
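An explicit aliasing check is cheap to add. A hedged sketch, where add_into! is a made-up function (a stricter version could use Base.mightalias to also catch views of the same parent):

```julia
# Hypothetical in-place function that refuses aliased arguments rather
# than silently producing order-dependent results.
function add_into!(dest::AbstractVector, src::AbstractVector)
    dest === src && throw(ArgumentError("dest and src must not alias"))
    for i in eachindex(dest, src)
        dest[i] += src[i]
    end
    return dest
end

add_into!([1, 2], [3, 4])  # [4, 6]
```

Note that eachindex(dest, src) also throws if the two arrays have mismatched axes, which is another cheap contract check.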

Also, I’m not an expert in AD, but I wouldn’t trust it in general. I don’t have the reference in hand, but I know that AD (regardless of its implementation) can lead to very serious floating-point errors. I think testing for correct gradients should be considered the best practice in any given application, and that should be put forth in the AD packages documentation and tutorials… (perhaps even the test could be produced automatically for a given user function?).

As for Julia’s Flux not competing with PyTorch yet in ease of use, that’s a different ball game. And if we include self-driving cars in the discussion (which are still inherently unsafe) I believe we are totally missing the point.

18 Likes

Hm, I’m not sure.

I believe generic programming is hard and we are working with some kind of gradual typing system, which is rather new. The author shows the weaknesses of this approach without mentioning alternatives, which indicates to me that we are evaluating the difficult path.

3 Likes

Light criticism out of the way first: It’s fair to abstain from a rapidly developing thus buggy language, and it might be fair to expect that to take longer with developers who are academics first. However, the blog and its cited sources did not adequately explain how the Zygote-related bugs and some other issues are rooted in the “extreme generality” and composability of Julia. For example, the linked 05/03/2022 blog post on JAX vs Julia(ML) explicitly says “My criticisms above are primarily of its ML ecosystem (not the language) so with some effort I could see these being fixed.”

As for the OffsetArrays-related bugs, this seems like more-than-fair criticism (a cold take by now). While I still have faith in composability, it doesn’t happen by accident; it happens through interfaces. A lot of headaches were prevented by documenting the interfaces for iteration, indexing, arrays, and broadcasting! But the expansion of the AbstractArray interface to non-1-based indexing hasn’t resulted in universally updated code (another example: require_one_based_indexing is still called in a lot of places, including matrix multiplication code), and people like the author are running into an effectively unstable interface the hard way. Improving interface documentation to list assumptions in addition to interface methods would be great, but I can’t help but think the 1-based-indexing code should’ve been caught and forced to be fixed much earlier. I’m not sure how: it’s simple to require and check a custom type for firstindex, but it’s much harder to figure out that for i in 1:n is a bug (maybe n isn’t the last index, maybe it’s supposed to fail for 2-indexed arrays). And it’s a tough ask to pore over the existing code base to make OffsetArrays work, ahead of other important work like compilation (latency, CLI feasibility, lightweight executables).

I could imagine that when introducing firstindex/lastindex/eachindex, a 420-indexed vector could’ve been thrown into a package’s tests to see what breaks. On the other hand, it would already be very vigilant to do that when base Julia changes, and ridiculously vigilant to do so for a dozen separately updating packages that may or may not document their breaking interface changes. Maybe it would help to commit to interface stability more rigidly, to clearly document expectations on composability, and to leave breaking changes to major version updates.

5 Likes

My short take on this post is that interest in Julia seems to be out-growing the number of experienced developers and well-tested packages we have, hence an increased likelihood of facing bugs as a newcomer, and the lack of documentation. This is a funding problem. In a way, I think Julia is a victim of its own success and efficiency. We managed to build a lot of packages in a short period of time with a fraction of the resources needed to build similar packages in other ecosystems, thanks to a large number of academics and grad students working in their “free time”. But most of us who started to develop these packages never imagined their code would be critiqued on HN’s front page one day. For the comparison to be fair, I think people should consider the bugs-per-funding ratio for similar packages across programming languages.

The second problem we have as a community is over-hyping and “too much marketing” in the sense that we like to claim things just work. But they don’t! Or even if they do, somebody needs to sit down and “check” that they do before announcing that they do. I think putting more restraint on the expectations we are setting should be part of the “official statements” more often. Sure, it makes the marketing sound less cool but setting the right level of expectation is arguably more important than sounding cool.

82 Likes

I do not have clear evidence to validate or challenge the primary assertion that there are more bugs in Julia than other languages, so I cannot address that part. My general sense is that I usually find bugs proportionate to the amount of time I spend using the language.

As for dealing with bugs in general, we need to adopt a no bugs culture and develop the tooling to support it. This will be difficult and will require resources.

While it is impossible to guarantee no bugs for an arbitrary set of packages, I do think it should be possible to significantly reduce the number of bugs for a pantheon of packages and the concrete types within them. That is, for Base, the standard library, and a set of mature community packages, we should develop a comprehensive testing framework to ensure that, at least within that particular environment, the packages compose correctly. While we build on generics and abstract types, over a closed set of code we should be able to systematically iterate over the concrete types.

We will need tools to define interfaces and verify that code implementing those interfaces is compliant. This should involve:

  1. A language to define the interface
  2. A way to declare that a block of code implements an interface, and a way to verify at the end of the block that the interface is fully implemented.
  3. An automated testing framework to automatically iterate through concrete types within the set of packages and evaluate how they compose with interfaces.
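Absent such tooling, point 2 could start small. A sketch of what a hand-rolled compliance check might look like today (check_array_contract is invented for illustration):

```julia
using Test

# Invented helper: checks a handful of AbstractArray contract
# assumptions that generic code tends to rely on.
function check_array_contract(a::AbstractArray)
    @testset "AbstractArray contract for $(typeof(a))" begin
        @test size(a) == map(length, axes(a))
        @test length(eachindex(a)) == length(a)
        @test first(eachindex(a)) == firstindex(a)
        @test last(eachindex(a)) == lastindex(a)
    end
end

check_array_contract(rand(3, 4))
```

A real framework would need far more checks (setindex!, similar, broadcasting), but even this much would catch some offset-indexing mistakes mechanically.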

We will need to expand tools like JET.jl and widen their use and adoption.

Increasingly what this sounds like is that we need to develop a sublanguage within Julia that is stricter than what Julia currently allows and that would be amenable to static analysis.

13 Likes

It is mind-blowing what the community has accomplished given the number of developers. But being more efficient doesn’t do you much good if you are massively outgunned by ecosystems with clearer maintenance continuity. OffsetArrays is the wrong package to criticize here because it isn’t “really” necessary. Others are.

100%, and people get burnt - especially with AD. Some things work as advertised and are state-of-the-art (e.g. SciML, JuMP, and a variety of smaller packages if you know where to find them). Others are an enormous hole chipping away at long-term usage (e.g. machine learning packages, even for niche uses like scientific machine learning, let alone standard ML stuff). More and more I find myself using Python frameworks because the deep-learning environment is much less infuriating and more full-featured. This has nothing to do with Julia as a language, but in practice I am unlikely to switch back for those applications.

14 Likes

I think our existing tools could work well for those tasks, at least to start with:

  1. English to define the interfaces better in documentation (I think the documentation for AbstractArrays is quite thin);
  2. functions like test_array_type(MyCustomArray) for interface testing (like Invenia’s package on this);
  3. make packages called something like AlternativeArrays.jl that offer a variety of functions that accept an array and return an OffsetArray, an AxisArray, or whatever, and then just go through the tests of other packages and add some for loops in there like this:
@testset "Alternative array compatibility" for f in alternative_array_factories
    @test unique(f([1, 1, 2, 3, 1])) == [1, 2, 3]  # unique returns a plain Vector
end

Edit: And if it’s too much hassle to open a bunch of PRs on other projects, the composition tests can just be done in some package that exists only for running lots of tests. Might be easier to manage.

3 Likes

I disagree. Any package proposing an interface should just document it and implement a test suite that any object implementing the interface should be run against (by the developer of the package implementing that object). No need to over-complicate things.

7 Likes

Might be better just to use existing packages, which have other uses, and quirks you may not think of. So if you claim to accept AbstractArray, then test some StaticArrays, some OffsetArrays, some weird LinearAlgebra types like SymTridiagonal.

(Likewise if you claim to accept Number, test some complex numbers, some integers, and some other numbers like those from DoubleFloats, Measurements, Unitful.)

But the StatsBase functions with 1:length(x) don’t even seem to test whether they accept matrices.
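Concretely, assuming a package has some generic function mysum (invented here as a stand-in for real package code), its tests might loop over a few stdlib types:

```julia
using LinearAlgebra, Test

# Invented generic function standing in for package code that claims
# to accept any AbstractMatrix.
mysum(A::AbstractMatrix) = sum(A[i] for i in eachindex(A))

# Exercise it with "weird" matrix types from the standard library.
@testset "mysum accepts AbstractMatrix" for A in (
        rand(3, 3),
        Symmetric(rand(3, 3)),
        SymTridiagonal([1.0, 2.0, 3.0], [0.5, 0.5]),
    )
    @test mysum(A) ≈ sum(A)
end
```

Using eachindex rather than 1:length(A) is what makes this pass for IndexCartesian types like SymTridiagonal in the first place.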

7 Likes

Yes, I meant that existing packages should be reused. AlternativeArrays would just have some convenience functions for writing those for loops, and be a single place where new subtypes can be registered to get tested in multiple packages. Maybe that would turn out not to be worth a package. Maybe it should just go in the property-testing library, and “array type” and “number type” just become one of the many variables that the proptest library will search through for bugs. I guess we’ll find out if someone gives it a go!

The interface stuff only addresses some of the issues anyway. I think it’s plausible that we could reduce the bug-rate with concerted work to add more tests (hopefully proptests?) to core libraries, especially focussing on bug-magnets like missing.

I think we need more than this for common interfaces. AbstractArray is a relatively common interface. We need test tools that make it easy to test methods against a variety of AbstractArray subtypes.

1 Like

I agree, but I think those tools can just be regular functions from regular packages.

2 Likes

Actually, I believe not having a good definition of what is and what isn’t a “standard part of the Julia ecosystem” is part of the problem. For example, looking at the docs on docs.julialang.org there’s definitely mentions of OffsetArrays, even in examples.

E.g. in Single- and multi-dimensional Arrays · The Julia Language (my emphasis):

As such, when iterating over an entire array, it’s much better to iterate over eachindex(A) instead of 1:length(A) . Not only will the former be much faster in cases where A is IndexCartesian , but it will also support OffsetArrays, too.

And in code in Arrays · The Julia Language

julia> src = reshape(Vector(1:16), (4,4))
4×4 Array{Int64,2}:
 1  5   9  13
 2  6  10  14
 3  7  11  15
 4  8  12  16

julia> dest = OffsetArray{Int}(undef, (0:3,2:5))

julia> circcopy!(dest, src)
OffsetArrays.OffsetArray{Int64,2,Array{Int64,2}} with indices 0:3×2:5:
 8  12  16  4
 5   9  13  1
 6  10  14  2
 7  11  15  3

julia> dest[1:3,2:4] == src[1:3,2:4]
true

So as it is referenced from the official documentation, and without any qualification of “non-standard”, I would assume using OffsetArrays.jl is fine and accepted (that’s certainly how I interpreted it when I first learned about it). And the same would hold for, say, a package like StaticArrays.jl (mentioned explicitly here). Yet it seems use of StaticArrays is much more accepted than OffsetArrays (in my estimation), and there might be some differences in maturity of the packages, especially when composing with others. But how can one tell looking at the documentation and/or package metadata? And how about the hundreds of other packages out there, and their combinations in which to put them together?

My point being, having a clearer view on what is a “standard part of the Julia ecosystem” might help if it leads to more developer priority on those parts working correctly. For example, in case of Python the standard library is somewhat more extensive than Julia’s, but definitely the responsibility of the Python core devs to keep up-to-date and correct. Anything that isn’t in the Python standard library, i.e. every package you install separately, is not the primary responsibility of the core devs. And for me I trust the Python standard library modules, as they’ve shown to be mature and working correctly. With Julia I have a far more unclear view of what is and what isn’t managed by the core devs in terms of packages not included in the Julia distribution itself, and what that implies for code maturity, correctness and composability. It certainly doesn’t help in perception of the package ecosystem, as it’s hard to draw a line on what is and what isn’t standard (and thus mature).

Just my 2 cents

15 Likes