Discussion on "Why I no longer recommend Julia" by Yuri Vishnevsky

If you look at the git blame, you’ll see that that code is 8 years old (StatsBase.jl/src/deviation.jl at commit d9cb8f8d6198e5813562e1efffe74fe3614d047e, JuliaStats/StatsBase.jl on GitHub). Pull requests are welcome!


There’s already a PR that (partially) fixes the offset-axes issue: https://github.com/JuliaStats/StatsBase.jl/pull/722.


I believe the sentiment of much of this is correct and must be taken seriously, but as far as I can tell a disproportionate number of the examples are about OffsetArrays? I understand he is using them to illustrate how composability has failed, but offset axes seem reasonably niche. Maybe that loosely defined generic interface isn’t really as composable as one would like, but people can get by with 1-based indices if necessary.

But as I said, I think the spirit of this must be taken very seriously. Not to mention the discussion of Zygote vs. PyTorch/JAX, which opens up a whole new can of worms.


I don’t understand the problem. There are checks on the lengths of both arrays a few lines above that code.

It’s not about the length. The code assumes that a and b have indices 1:length(a), which is not the case for all AbstractArrays (e.g., OffsetArrays).
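To illustrate (a minimal sketch, not the actual StatsBase code; `msd_buggy`/`msd` are hypothetical names): the failing pattern indexes with `1:length(a)`, while the generic version iterates `eachindex(a, b)`, which yields the shared indices and errors if the axes are incompatible:

```julia
# Buggy pattern (hypothetical, mirroring the kind of code in question):
# silently assumes the valid indices are 1:length(a).
msd_buggy(a, b) = sum(abs2(a[i] - b[i]) for i in 1:length(a)) / length(a)

# Generic version: eachindex(a, b) yields indices valid for both arrays
# and throws a DimensionMismatch if their axes do not agree.
msd(a, b) = sum(abs2(a[i] - b[i]) for i in eachindex(a, b)) / length(a)
```

With plain `Vector`s both versions agree; with an offset array the first one either throws a `BoundsError` or, worse, silently reads the wrong elements.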


That’s one way of dealing with the issue. Another is finding abstractions that could “tame” the composability a bit, as Mike put it:

I think it’s absolutely right to celebrate Julia’s approach to composition. I also hope new research (in Julia or elsewhere) will help us figure out how to tame it a bit.

Which is in spirit what I advocate for here. But I’m not sure it’s feasible at this late point.

The flexibility-vs-structure issue goes beyond composability into the correctness of compiler transforms and the predictability of Julia’s performance model: two concerns made more acute by AD, GPUs, and the other things we now ask of Julia’s semantics. The Zygote issue touches on this, but it goes further. Julia’s full dynamism needs a touch of restriction, even if opt-in, if it’s ever going to reach the promised full-language differentiable programming + composability + GPU. It’s currently trying to do that in ad hoc ways like immutable arrays and pure DL frameworks… but if you have purity without handling effects, that’s just JAX (except without TPUs, linalg optimizations, and in-place update copy elision (for now?)).

And JAX is already really good. I say this with love and a bit of disappointment, but I think for Julia to succeed it needs work on the structure side, not just the compiler side.

Dex is a good example of a language that attempts to strike a balance (again, see my post State of machine learning in Julia - #25 by Akatz and the Dex issue about ad hoc polymorphism): purity, but with effect handlers. Maybe Julia has another local optimum? But it’s a hard design problem.

I think it’s a fundamental existential problem for the language but unfortunately Yuri is correct that I don’t see it acknowledged widely. I’ve witnessed some explicit dismissal of alternative approaches (like language level traits and more fundamental approaches to handling mutation) and I think Julia is doing that at its own peril.

Though how much Enzyme can help remains to be seen… but even if it works, it ties Julia to LLVM (so no compilation to XLA and TPUs), and it’s unclear how well it will do with high-level branching code.


I agree that it would be helpful to have an accessible system for interface tests (i.e. registering tests for AbstractArrays that the author of a custom array type can easily find and run against their own type). Invenia published a package and/or a workflow for interface testing, but it is not well known and it requires a fair amount of ceremony to get right. This could establish whether e.g. addition must be commutative for some abstract type, or important facts about how iteration should behave.
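As a sketch of what such a registered interface test suite might look like (the helper name `test_abstractarray_interface` is hypothetical; Invenia’s actual package may differ), the package that owns the interface ships something like this, and authors of custom array types run it against an instance of their own type:

```julia
using Test

# Hypothetical interface test suite: checks a few invariants any
# AbstractArray implementation is expected to satisfy.
function test_abstractarray_interface(A::AbstractArray)
    @testset "AbstractArray interface: $(typeof(A))" begin
        @test length(A) == prod(size(A))
        @test firstindex(A, 1) == first(axes(A, 1))
        # every index produced by eachindex must be in bounds
        @test all(i -> checkbounds(Bool, A, i), eachindex(A))
        # cartesian iteration must visit the same elements as collect
        @test vec([A[i] for i in CartesianIndices(A)]) == vec(collect(A))
    end
end

test_abstractarray_interface(reshape(1:12, 3, 4))
```

A real suite would of course check far more (iteration, `similar`, `setindex!` for mutable types, and so on), but the key idea is that the tests live next to the interface definition rather than being rediscovered by every implementor.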

For the issue of packages using the interface of a value incorrectly, I think tooling and greater awareness of the issues could help. An important part might be some way of discovering what the interfaces actually are. That would be handy both for people implementing new types and for people using interfaces.

Simple tools, like an easy way to run a package’s tests but with regular arrays replaced by star wars arrays and with @inbounds turned off, might be handy. Linters and property testing have been successful in other languages, I think.

I don’t see any quick way around the correctness issues with our statistics libraries. Perhaps someone could write a bunch more tests, or copy tests from R or Python libraries that do similar work?


The problem is it assumes the valid indices are 1:length(a). This cannot be assumed for an arbitrary AbstractArray:

julia> using OffsetArrays

julia> A = OffsetArray(1:10, -2)
1:10 with indices -1:8

julia> A[length(A)]
ERROR: BoundsError: attempt to access 10-element OffsetArray(::UnitRange{Int64}, -1:8) with eltype Int64 with indices -1:8 at index [10]
 [1] throw_boundserror(A::OffsetVector{Int64, UnitRange{Int64}}, I::Tuple{Int64})
   @ Base .\abstractarray.jl:691
 [2] checkbounds
   @ .\abstractarray.jl:656 [inlined]
 [3] getindex(A::OffsetVector{Int64, UnitRange{Int64}}, i::Int64)
   @ OffsetArrays C:\Users\kittisopikulm\.julia\packages\OffsetArrays\N7ji6\src\OffsetArrays.jl:428
 [4] top-level scope
   @ REPL[100]:1

This is the most tractable way to address this at the moment. We need testing tools to see if common interfaces such as the AbstractArray interface are being correctly used. These testing tools become the de facto definition of the interface.


This pull request was pointed out as addressing some of the issues.


Clearly a lot of effort was put into this pull request. How can we help @Lilith get it merged? It looks like there are still some outstanding issues from @nalimilan.

Also what can we suggest in general about pull requests to increase the merge rate?


I do not think the article is entirely fair, though it is of much higher quality than the average article criticizing Julia.

  1. The comparison is to older and more mainstream libraries. While this is a legitimate viewpoint for choosing what to use for your work right now, it is not fair in a more general sense. The comparison would need to be between ecosystems/languages at the same level of maturity. The more interesting question to me is whether Julia will have these problems when it reaches the age and popularity that other languages/frameworks have now.
  2. While I do understand the systemic argument (and, in fact, I apply it to subjects like racism and discrimination in general), the examples are a little lacking. Checking for aliasing does not seem to me to be the responsibility of most functions (i.e., you should not assume you can pass the same object as two distinct arguments unless stated otherwise), and the bounds problem is also more nuanced: the code may have been right for the Julia version it was written for, but inadvertently kept for newer versions. I think my disagreement is rooted in a different perspective, in which I accept that the extra flexibility of generality comes at the cost of my having to check whether the pieces actually work well together, instead of assuming everything will work flawlessly. The element of unfairness in this comparison, to me, is that we would need to be comparing to an equally flexible language. Python is fair (it has just had a lot more time to mature); other languages do not allow the generality that Julia allows, and therefore the bugs cannot be pinned on the language itself; instead they will be pinned on each individual re-implementation of a method, because the language did not allow for generality. There are a lot of problems with OffsetArrays.jl, but most other languages do not even have something like OffsetArrays.jl, or the expectation that most code written would automatically work with custom indices.

The conclusion of the article is a little muddy. The article is a personal recollection of facts related to a change of posture by the author, so it does not go out of its way to offer a solution, and it even admits that what it identifies as a systemic problem may be unsolvable (maybe it is inherent to high generality?). The statement “For the majority of use cases the Julia team wants to service, the risks are simply not worth the rewards.” is probably the strongest claim in the article, and it is hard to rebut, not because it is right but because it is too informal (what are “the majority of use cases the Julia team wants to service”, and how is this risk/reward analysis being done for each of them?). Everyone can only argue for their own use case; in mine, I think the risk/reward is worth it. But the author gets to make a blanket statement like this without really presenting the analysis in the article (again, it does not even compare metrics with other languages/frameworks, so the only really solid claim is that Julia has problems, not that it is worse than the others).

I agree partially about the generality problem; by this I mean we could have what we have today (in terms of generality) but with fewer bugs. I do not think the fault lies in the language design: a trade-off was made, and I like the trade-off (it will not be the best for every use case, of course, but no language will be). I think the problem is within the community, but not in the same sense the author of the article implies. I believe the interfaces should remain (in technical terms) the way they are right now, but better described by their authors, and it should be the responsibility of each module proposing an interface (including the ones in Base) to provide a test suite that checks the invariants for an object of a type implementing that interface. If the object/type passes the test suite but does not work with a function that assumes the object implements the interface, then the problem is within the function (it understands the interface incorrectly).


This prompted me to run a poll. Does OffsetArrays.jl correctly implement methods for eachindex, given the contract specified in its docstring?

  • Yes
  • No
  • Some are correct and some are incorrect
  • Hard to tell

Julia as an open ecosystem means that package authors never realize how much of what happens outside is left uncovered by their limited test cases. This is both good and bad: designing an interface concept, if achieved, might be harmful in the sense that people will become more restrictive about types and eventually make it a closed ecosystem.

And the flexibility of multiple dispatch makes it easy to hit a no-man’s-land of cryptic error messages that only experienced developers understand. It is often the case that you spend days tracing the right dispatch route only to add or fix one small method. This is also the thing I’ve seen too many Julia users (including myself) complain about. This is, again, both good and bad.

The solution? I don’t know. More carefully written test cases, and more documentation to explain the design and educate users, I guess. But many package authors don’t take tests and docs seriously; people are often fooled by the code-coverage number and think it’s near 100%. Absolutely not: your effective coverage might be far less than 10% if you count all the valid input compositions.

Speaking of lines of code, Julia’s composability and flexibility mean that one often needs a 1:1 or even 1:3 src-to-test code ratio just to ensure that “most” things work. But most people only test one or two use cases, which is far from enough. If you check the OffsetArrays.jl codebase using cloc, you’ll find approximately 900 lines in src and 2200 lines in test.

When I try to depend on a package that I don’t maintain, I often check how carefully the authors write tests. If the tests are not well written, I refrain from using it, no matter how good it declares itself to be. On this, I really, really appreciate how @oxinabox writes the tests in all the packages she maintains (ChainRules even has an accompanying test-helper package, ChainRulesTestUtils), and I always feel lucky that my first few Julia contributions were under her review.

Even if we do our best (I tried very hard to write tests when developing JuliaImages), we still always get surprising bug reports when users don’t follow the design. For instance, I pass Array{<:Colorant} into Distances :see_no_evil:, which is a big surprise for stats people. And unless we start to be more restrictive with function type annotations, there is little we can do, but we want to build a shared, open ecosystem.


Just to make my phrasing clearer: I meant that many packages do not expect a custom index start and therefore do not work well with OffsetArrays.jl, not that the package itself is implemented wrong.

But to answer your question, this depends on whether OffsetArrays.jl has “opted into fast linear indexing”, as the eachindex docstring says:


Create an iterable object for visiting each index of an AbstractArray A in an efficient manner. For array types that have opted into fast linear indexing (like Array), this is simply the range 1:length(A). For other array types, return a specialized Cartesian range to efficiently index into the array with indices specified for every dimension. For other iterables, including strings and dictionaries, return an iterator object supporting arbitrary index types (e.g. unevenly spaced or non-integer indices).
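The three cases in that docstring are easy to see at the REPL (a quick illustration using only Base types):

```julia
# Case 1: an Array has opted into fast linear indexing.
v = [10, 20, 30]
eachindex(v)                 # Base.OneTo(3)

# Case 2: a non-contiguous view has not, so we get a Cartesian range.
M = zeros(3, 3)
s = view(M, 1:2, 1:2)
eachindex(s)                 # CartesianIndices((2, 2))

# Case 3: other iterables, like dictionaries, return their keys.
d = Dict(:a => 1, :b => 2)
eachindex(d)                 # the keys of the dictionary
```

The ambiguity discussed below is about which of these contracts an offset array with linear indexing is supposed to satisfy.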

On the other hand, I really dislike the Base.eachindex definition, and I would prefer to have it changed.


We should normalize listing in docs and READMEs what compositions we have tested and perhaps also listing what compositions we know do not work.


Building on what @JohnnyChen94 said, with specific focus on this point:

While the blog post perhaps does not provide a single, well-honed argument directed at the language, there are some very clear takeaways. For one, it references legitimate shortcomings in “older and more mainstream” libraries in the Julia ecosystem. Sure, we do not expect new libraries to emerge fully formed and ironclad, but bedrock infrastructural components with many years of development and (nominally) more eligible maintainers for continued development can be held to a higher standard.

Now, one challenge that has been brought up wrt issues around interop/composability is that we don’t know what we don’t know when it comes to users combining libraries in novel ways. While I empathize with that perspective, I would posit that in many cases we can either anticipate issues or have seen them before. In that light, things are not so hopeless and there are actionable things package maintainers can do:

  1. Make interop work correctly. This is currently constrained by not having glue packages, but in many cases one package will be a dep of another.
  2. If 1) is not possible, error or warn when a problematic combination of inputs is detected. An in-your-face message is difficult to miss!
  3. If runtime checks are not feasible, then @mkitti’s suggestion sounds great. 🔪 JAX - The Sharp Bits 🔪 — JAX documentation may provide some inspiration here. My only addition would be that the list should be accessible directly from the README and/or docs landing page.

All of the above are an improvement over the status quo of users scouring issues and discourse posts, often without a good set of keywords. Heck, even as a package maintainer I often find myself wasting half an hour here or there finding related issue reports. All this work does rest on the assumption that core libraries in the ecosystem have enough dev capacity (looking at number of contributors, commit and release frequency) to tackle it and are not suffering from a case of XKCD #2347. If that is not true for particular packages despite outward appearances, then perhaps there is a more fundamental issue at hand.


There is a quoted tweet:

Is Julia really appropriate for high-assurance real-time control systems at this point? I find this unlikely given the immature static analysis tools and the difficulty of managing allocation/collection pauses.

The “two-language problem” that people usually talk about is really a “two-requirements problem”: fast programming and fast throughput, which is sometimes addressed by the combination of Python and C++. There are reasons to use C++ other than “high throughput” and, for that matter, reasons to use Python other than “fast programming”.

I think as a community we should be careful to avoid pattern matching on “Python and C++” as a generic source of potential Julia users. Julia is primarily targeted at one use case of those languages, and users with other requirements will be disappointed by Julia.


So it seems most users agree that the implementation of eachindex is correct. However, the docstring allows only two return values for arrays: either 1:length(A) (which clearly must not be returned for OffsetVector, and indeed is not) or a “specialized Cartesian range” (which it does not return either). While it makes sense that IdOffsetRange is returned, the point of my question is the following:

  • we write a generic function accepting AbstractArray x;
  • inside the function we run Base.IndexStyle(typeof(x)) and for OffsetVector we get IndexLinear()
  • given the eachindex docstring, we might assume that eachindex returns 1:length(x); this assumption is wrong, though.
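A self-contained way to see this mismatch, without depending on OffsetArrays.jl, is a minimal offset vector (`TinyOffsetVector` is a hypothetical type for illustration only; real code should use OffsetArrays.jl):

```julia
# Minimal offset vector: stores 1-based data but reports shifted axes.
struct TinyOffsetVector{T} <: AbstractVector{T}
    data::Vector{T}
    offset::Int
end
Base.size(v::TinyOffsetVector) = size(v.data)
Base.axes(v::TinyOffsetVector) = (v.offset+1:v.offset+length(v.data),)
Base.IndexStyle(::Type{<:TinyOffsetVector}) = IndexLinear()
function Base.getindex(v::TinyOffsetVector, i::Int)
    checkbounds(v, i)          # bounds are the shifted axes, not 1:length
    v.data[i - v.offset]
end

v = TinyOffsetVector(collect(1:10), 2)   # valid indices are 3:12
IndexStyle(typeof(v))                    # IndexLinear()
eachindex(v)                             # 3:12, not 1:length(v)
```

So even for a type that reports IndexLinear(), eachindex returns the array’s own axes, which is exactly the gap between the docstring’s wording and the behavior being discussed.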

My conclusion, in accordance with what @Henrique_Becker has written, is that the crucial problem is the eachindex docstring. IMHO we need to carefully review the contracts that Base Julia functions provide, to make sure they are precise and easy for developers to understand, so that when developers add methods to such functions they can implement them correctly. Unfortunately, this is quite hard to achieve in practice.


Even though I’m very enthusiastic about Julia in general, this hits home. I’ve personally run into the problem of ChainRules returning incorrect results. Luckily, I’d come across the various discussions about bugs in Julia’s AD frameworks before, so as soon as I saw any kind of misbehavior in my program, checking the correctness of the derivatives was the first thing I did. I was able to identify the bug relatively easily and submit a bugfix, but it still cost 3-4 full workdays, so not an ideal situation. I’m somewhat prepared to always test the correctness of any third-party library for my particular use case, but it does add overhead, and it’s very much a sign of an immature ecosystem.

Contrary to the OP, I don’t really think that these growing pains cannot be overcome. It will probably just take time; after all, Julia is still quite young. But it is a problem, and for the entities that can provide funding (NumFocus, JuliaComputing) it would be good to try to address these issues consciously and directly. That probably means finding ways to have full-time paid software engineers on some of the core packages of the ecosystem, and investing in tooling.

Also, the place where I personally feel Julia’s relative immaturity most is the lack of effective linting/code analysis/testing tools. There’s a lot happening, but it doesn’t yet come anywhere close to the level of tooling that e.g. Python has (although it’s probably a lot better than Python’s tooling at 10 years of age). Right now, it’s very hard to find misspelled variables, unused code paths, issues like the incompatibility with OffsetArrays that the article mentions, etc. VSCode claims to have some tooling, but either I’m not setting it up correctly, or it just doesn’t work very well (plus, I very strongly prefer vim, so independent command-line tools would be preferable).


Yeah, while I of course like Julia, the author has a point. I don’t think it has to do with a lack of interfaces, but rather with the mantra, often repeated by the community, that packages will “just work” with a whole ecosystem of independently developed packages, which is simply not true.

Part of the reason Julia has been so successful is that such extreme composability is possible, but I think the main point brought up by this article is that the ecosystem tends to assume that disparate packages will be composable, and if they are not, then there will be some obvious error message.

In my experience, unless the maintainers of packages are both very experienced, and put in effort to ensure continued compatibility (as e.g. with SciML), composition regularly breaks. Fortunately the outcome of breakage is usually just an impenetrable error message, but as this blog points out, it can sometimes result in bugs that are very difficult to detect.

As others have concluded, I agree that the primary issue is that testing practice in Julia comes from languages that are far less flexible, and that practice is not suited to the enormous range of possible behaviors that widely used code can encounter.