"A Tragedy of Julia’s Type System"

(I’m not the author of this article.)

Wow, that’s quite the tragedy indeed! My primary response to the author’s fears is that these inferred unions can disappear just as easily as they appear, with a something() call or a branch like x === nothing && error(). Sure, you need to handle it to avoid snowballing type instabilities until the sky falls, but hey, that’s exactly what we programmers do as a fundamental part of our daily work, right? It also looks like we’re due to nominate @MilesCranmer for a Turing award thanks to DispatchDoctor.jl :slight_smile:
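A minimal sketch of the narrowing described above (function names are made up for illustration):

```julia
vals = [10, 20, 30]

# The inferred return type here is Union{Nothing, Int}:
maybe_index(v, x) = findfirst(==(x), v)

# ...and the union disappears again with `something` (throws if `nothing`):
narrowed_a(v, x) = something(maybe_index(v, x))   # inferred as Int

# ...or with an explicit branch that rules `nothing` out:
function narrowed_b(v, x)
    i = maybe_index(v, x)
    i === nothing && error("value not found")
    return i                                      # inferred as Int on this path
end

@assert narrowed_a(vals, 20) == 2
@assert narrowed_b(vals, 30) == 3
```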

20 Likes

Defining findindex as follows:

findindex(arr, val) =
  (x = findfirst(==(val), arr); isnothing(x) ? error("oh no!") : x)

This gives a version of findindex tailored to the case where val is known to be in arr. Julia is smart enough to make this a type-stable function and thus to optimize the rest of the code appropriately.
This implementation is robust to buggy arguments that don’t honour the contract (giving a nice error), but other implementations of the same semantics might not be.
In any case, Julia seems better in this instance than the other languages mentioned with respect to this issue. Julia is not free from criticism, but it isn’t clear the linked post points to a technical problem.

I think the author has summarized their (very valid) point of view nicely at the end of the article:

Disclaimer : I left the Julia community two years ago to work on AI and large language models in Python, mainly because I lost interest in programming language theory and my hope for improving Julia. In the AI era, I believe no reasonable person should waste time on this unpromising language. Seeing Llama.c and Cradle in Rust, even Go having Ollama, while Julia — a language that claims to excel at numerical computing — has produced no equivalent. Something must be wrong. The community remains tragically immersed in unrealistic fantasies, believing everything will improve with TTFP and AOT (which won’t happen, as analyzed above). For a language that doesn’t even support distributed training (llama.c supports it flawlessly), what difference would AOT make? It’s all too late.

I think there’s clearly a significant gap between Julia and Python in ML; the author seems convinced Julia is worthless, at least in the traditional AI realm: x.com

I don’t think we need to spill much ink in this discourse over this; people can have their priorities, and Julia certainly isn’t the best fit for everything.

14 Likes

Posts like “I quit Julia because,” especially when some of the reasons are valid, end up hindering the language’s adoption and growth.

5 Likes

This post is also on Hacker News. I’ll just copy-paste my response from there:

Very cool article! I think it does a good job of diving into the tradeoffs of a dynamic type system versus a static one.

I guess the conclusion depends on what priorities you come to the table with. If your starting point is that 1) code must be optimally fast (so no unions are acceptable), and that 2) programmers cannot be trusted to keep track of the types they are using, and so must always explicitly opt-in to union-like types (like e.g. Rust does), then yes, you will naturally conclude that implicit unions are a bad design.

However, if your starting point is the observation that scientists and engineers overwhelmingly prefer dynamic languages for data analysis, because of all the boilerplate caused by forced, explicit type handling, then you naturally conclude that anything other than implicit union types is a complete non-starter for a language for technical computing.

One could write a just as legitimate and valid blog post about the ‘tragedy’ of Rust’s enum types, highlighting how they inevitably lead to boilerplate and extensibility issues.

I would also argue against the author’s claim that it’s unrealistically hard to reach type safety (really: avoiding unions) in Julia. If you write a Julia function with good performance, it will continue to have good performance in the future - union types will not show up spontaneously.
Sure, performance can regress if you refactor - just like in any other programming language. So you need to benchmark - like in every other language.

34 Likes

It’s a little disappointing to see a former forum participant ignore all the ways that inferred Unions are dealt with semantically in order to argue that inferred type stability is too hard in a dynamically typed language, then tack on the logical leap that this is the core reason other languages have better support for a particular LLM. Labelling this, and (semantically identifiable!) unsafe code in Rust, a systems language, as tragedies is telling.

3 Likes

I guess I get some of the worry in this article, though I disagree with the conclusions and the severity of the issue, but I struggled to take it seriously after reading

This means writing type-safe Julia programs in 2025 is about as challenging as writing assembly with punch cards in the 1980s — manual function-by-function inspection is pretty much the only option.

because… really? Seems a bit absurd.

I also don’t care enough about AI or LLM stuff to comment on much of that part, but language like

In the AI era, I believe no reasonable person should waste time on this unpromising language

just reads like someone frustrated and venting about the language rather than something worth taking seriously.

14 Likes

Would it make sense for Julia to have optional types and implicitly unwrapped optional types as Swift does?
https://docs.swift.org/swift-book/documentation/the-swift-programming-language/types#Optional-Type

1 Like

Field/element types can be used to make optional types in Julia (and people have), with less syntactic sugar (well, maybe macros could provide something like Swift’s if let), but it wouldn’t substantially change usage beyond how we write nothing vs nil handling. The fundamental difference is that Swift always needs this in the language-level type system because it’s statically typed.
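A sketch of what such a hand-rolled optional could look like via a Union field (all names are hypothetical; this is not an existing package API):

```julia
# A minimal Optional built from a Union-typed field.
struct Optional{T}
    value::Union{Nothing, T}
end

# Empty constructor, analogous to Swift's `nil`:
Optional{T}() where {T} = Optional{T}(nothing)

# Swift-style forced unwrap (`!`), spelled as a plain function:
unwrap(o::Optional) = o.value === nothing ? error("unwrapped empty Optional") : o.value

# Swift's nil-coalescing (`??`) maps onto a default argument:
unwrap_or(o::Optional, default) = o.value === nothing ? default : o.value

@assert unwrap(Optional{Int}(42)) == 42
@assert unwrap_or(Optional{Int}(), 0) == 0
```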

1 Like

I understand and share some of the author’s concerns about the exploding possibilities of type instability. However, I wouldn’t go as far as they do and let such frustration turn into an articulated doom post. I do think there are some valid takeaways from the post, as many have commented above.

Personally, I wouldn’t say the Union-type system alone is what makes type inference hard relative to a program’s complexity in Julia. It is the extremely “powerful and deadly” combination of Union types and multiple dispatch (even more so the latter).

In my opinion, multiple dispatch has become a double-edged sword that requires delicate care from Julia developers. On the one hand, it provides unmatched customizability, combining code from different parts of your codebase (or different libraries) to build functions/programs with nontrivial new features. On the other hand, the binding of these methods through multiple dispatch also propagates type instability from each node of your overall computation graph.

In general, the type stability of your top-level wrapper function is almost only as robust as the “weakest link” in your chain of functions. This is true for any language that requires type checking. However, with multiple dispatch, the problem is aggravated by giving users more freedom to combine methods from different codebases. Essentially, Julia solves the “two+ language problem” by creating a single language with complex data-type structures.

Thus, I don’t think the conflict between high expressivity and low type instability is universally resolvable, even if we were to redesign Julia today (or any dynamic PL that implements JIT and multiple dispatch). Though I’m not optimistic enough to think there will be such a solution, I have been willing to trade the ease of stabilizing types for the unparalleled expressivity of Julia’s multiple dispatch system.

A more constructive/helpful direction for this whole issue may be not how to eliminate the possibility of writing type-unstable functions at the language-feature level (the author of this article, along with many other Julia detractors, seems to share such an obsession), but whether the language can be improved to provide systematically easier or more manageable ways to confine the pollution of returned Union types from one method to another.

Personally, I believe the key lies in giving users more control over the type parameters and instances of composite types. And I think we already implicitly have such a need for the purpose of small-binary compilation: the more control you have to limit the number of methods tied to specific combinations of argument type signatures, the smaller a binary executable you can get. And often, the number of type instances explodes because of unbounded type parameters.

3 Likes

Please note that this is generally not true! Julia has the great feature that type instabilities are generally not contagious, with the sole exception being when you end up with a container of mixed elements.
As soon as you call another function and pass the “unstable” value in, a dynamic dispatch happens, and then everything is stable on the other side. That’s exactly what a function barrier is.

In that light, I find the blog post rather contrived. For performant Julia code, only the innermost hot loops need to be type stable, and you can usually achieve that almost automatically just by putting them in their own function.
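A minimal sketch of such a function barrier (names are made up for illustration):

```julia
# On this side of the barrier, `T` is concrete and the hot loop is fully
# type-stable, whatever mess happened in the caller.
function kernel(a::Vector{T}) where {T}
    s = zero(T)
    for x in a
        s += x
    end
    return s
end

function unstable_caller(flag::Bool)
    # `a` has an unstable type here (two possible Vector types):
    a = flag ? ones(Int, 10) : ones(Float64, 10)
    # One dynamic dispatch at the barrier; everything inside `kernel` is stable:
    return kernel(a)
end

@assert unstable_caller(true) === 10
@assert unstable_caller(false) === 10.0
```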

13 Likes
  1. Dynamic dispatch in Julia is very slow, especially compared to run-time switch statements (e.g., if-else). This is one of the reasons why it’s advised to avoid excessive specialization of parametric types in addition to unstable return types. In the original post, the author also reported that Julia’s dynamic dispatch scheme is even five times slower than Python’s. You cannot simply rely on dynamic dispatch when your function returns type-unstable container objects (e.g., Array).
  2. The very existence of function barriers is to mitigate the leaking effect of type instability by explicitly creating efficient core functions that can be compiled and optimized individually by the compiler. This actually proves my (and the author’s) point because “Union-type pollution” is the natural consequence of naive compositions of functions in Julia, which can dramatically degrade performance. If you want to claim:

The reality should be the other way around, where people do not need to manually create function barriers in the first place.

Furthermore, you can also read the specific paragraph under the function-barrier code example, where the official documentation explains why we want to create a function barrier manually in that case. It is precisely because the compiler fails at type inference when one part of the original strange_twos is type-unstable while the other part can be isolated as a “function barrier” (fill_twos!):

Julia’s compiler specializes code for argument types at function boundaries, so in the original implementation it does not know the type of a during the loop (since it is chosen randomly). Therefore the second version is generally faster since the inner loop can be recompiled as part of fill_twos! for different types of a .

The article is missing an obvious solution to the issue (I’d say that’s because it’s pretty usual FUD that tends to be unconstructive on purpose, but opinions may vary):

Julia embeds a whole working Julia compiler, so we can easily embed and run pieces of code that are type-checked by a different algorithm and use them more or less transparently. Having an alternate type system implemented that way provides relatively good compatibility with existing Julia code, gives migration paths, lets users employ a stricter type system only for the code where it makes sense, and the size of such a project is nowhere near “rewrite the whole of Julia” (by far!).

The recent JuliaSyntax.jl adoption made implementing such Julia extensions relatively easy, so I started working on a PoC for this. The main issue now seems to be embedding the type-inference results into Julia code. (Apparently, there’s no really straightforward way to tell the compiler that we really, really know that a method (which we know too!) with fixed, known argtypes is going to return some exact type… I was annoying people in #internal with that recently. :sweat_smile:) This is bothersome because it may create runtime inefficiency, but that’s it; everything else seems to “translate” well. So I hope to have a PoC of full strict inference, typeclasses, ADTs with patterns, and similar handy stuff sometime later this year.

(Opinions very welcome at this point.)

11 Likes

I think we are talking past each other somewhat.

I only objected to your statement that “multiple dispatch [also] propagates type instability” because dispatch boundaries generally do not propagate type instability. So Unions are usually not a problem (there is also the union-splitting optimization, which transforms small unions of up to 4 possibilities into exactly the if/else you mentioned).
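A sketch of what union-splitting amounts to, with the compiler’s implicit branches written out by hand (illustrative names, not the actual compiler output):

```julia
# Inferred return type: Union{Nothing, Int}.
find3(v) = findfirst(==(3), v)

function caller(v)
    i = find3(v)
    if i isa Int        # the compiler inserts branches like this automatically
        return i + 1    # on this branch, `i` is a plain Int
    else                # here `i === nothing`
        return 0
    end
end

@assert caller([1, 2, 3]) == 4
@assert caller([1, 2]) == 0
```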

An exception is abstractly typed containers like Vector{Number}. These are themselves concrete types and thus cannot be resolved by dispatch, so they propagate through your program. You correctly point out that runtime dispatch is rather slow, so performing repeated accesses to abstractly typed containers is indeed a performance pitfall. However, in my experience, avoiding these abstractly typed containers is rather easy in practice if you pay a bit of attention. If you really cannot avoid them, you still have other options, like using LightSumTypes.jl, which converts the runtime dispatch back to if/else statements.

I think this contradicts itself, or is not precise enough in its terminology. If you “naively compose functions”, then each function call is a function barrier that resolves all Unions into a concrete type. If the union is small, the cost is that of an if/else statement. Thus, to stay with the example from the blog post, findindex(array, elem) returning nothing or Int is not a problem and will just be dealt with by union-splitting. It only becomes a problem if you put the results of these functions into a Vector, because the Vector cannot remember the individual types of its elements.
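To make the distinction concrete, here is a small sketch, assuming a findindex like the blog post’s (helper names are hypothetical):

```julia
# Inferred return type: Union{Nothing, Int}.
findindex(v, x) = findfirst(==(x), v)

# Union-split at each call site: the Union costs about an if/else.
hits(v, xs) = count(x -> findindex(v, x) !== nothing, xs)

# But stored in a Vector, the element type is forced to the Union,
# and every later access pays for it:
results = [findindex([1, 2, 3], x) for x in (2, 9)]
@assert eltype(results) == Union{Nothing, Int}
@assert hits([1, 2, 3], (2, 9)) == 1
```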

Another word on this, perhaps, speaking from my own experience (4 years working with Julia most days): if you write idiomatic Julia code and structure your code well, then you don’t have any issues with type instability.
If you do weird stuff, like abusing a Vector to build a kind of recursive data structure for which it is not really intended, then you will surely find performance issues. It is, after all, possible to write slow code in any language. For me the great advantage of Julia is twofold. First, it is very easy to optimize only the part of the code that is important to optimize (because you spend the majority of the runtime there). Second, getting most of the performance is rather easy (cf. function barriers) while the code stays readable.

3 Likes

I remember this user from Discourse and elsewhere[1]. They had some good ideas and insights into the design of the language, especially areas around type stability and how to scale the language to accommodate large, well typed codebases. It’s sad to see them gone and left with a bad experience, but such is the nature of having a large, public language community.

My recollection is that the mismatch came from wanting a different vision for Julia. I don’t think it’s controversial to say the language currently favours a more exploratory, additive workflow where one is often tweaking different levels of the stack, testing (and discarding) ideas, etc. Whereas especially for a lot of ML use cases like those mentioned in the article, what you want is more of a “platform” language which lets you do a few things and do them well with minimal hassle.

There are absolutely trade-offs on these two dimensions. Having lots of strict semantic rules such as a complex type system can offer more confidence about code after it’s written, but it also might prevent that code from being written in the first place. Likewise, having a language like Julia that supports very “porous” abstractions (for lack of a better term, note that I explicitly did not write “leaky”) can make optimizing up and down the stack easier. But it also means that issues can more easily propagate across the stack because the boundaries between components are hazier. It’s not the best example, but I think the tension between our default aggressive cross-function/module/package inlining and invalidation/world-splitting (i.e. controlling dynamic dispatch)/static compilation is one example.

I meant at some point to write at length about this disparity between platform vs analysis code, but it was going to be too much work. For now, let me just use it to opine on why people seem to talk past each other in these discussions:

I’d say there is a gradient here. The more one writes analysis or numerical code, the more this is true. Julia eats this kind of code for breakfast!

But the more code has to focus on other concerns like messy string manipulation, handling of arbitrary user input, working with dynamic data formats, etc, the more the pendulum swings away from the usual performance tips. Function barriers and such can help, but they are far from a panacea. I find “platform” code seems to experience more of this, for whatever reason: perhaps it’s that non-numeric domain modelling just has a level of inherent complexity which necessitates it. Think most of the Julia web frameworks, some of our static analysis tooling, and even Core.Compiler itself (which is nowhere close to 100% type stable).

In the past, this dichotomy has been used to claim the Julia community doesn’t care about “software engineering”. I don’t necessarily agree with that framing, but perhaps it’s worth asking how much we care about both ends of the spectrum. If the answer is that most everyone is focused on the numerical/analysis end, that’s great! But then we should not be surprised when people with different opinions on this make their opinions known.


  1. I will not be linking their posts because they intentionally left the forum. However, it should not be difficult to find what they posted. ↩︎

13 Likes

There are some issues you might run into, even with purely numerical code.

I recently spent three full days trying to understand a type instability that in the end turned out to occur due to capturing a variable in an anonymous function. I still have no intuition about why it happens, but simply creating a new variable local_x = x and then capturing local_x instead of x completely solved the problem (x was a scalar and immutable value).

I still feel a bit annoyed and perplexed about this, and it seemed to violate my mental model “a variable is just a label for a value”.
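For readers who haven’t hit this: the best-known variant of the capture problem involves reassigning the captured variable, which forces Julia’s lowering to box it (the post above describes a subtler case where even a never-reassigned immutable scalar was affected). A sketch with hypothetical names:

```julia
# Reassigning a captured variable makes Julia box it (Core.Box),
# so reads of `x` inside the closure become type-unstable.
function boxed(x)
    x = x + 1               # reassignment => `x` becomes a boxed capture
    f = () -> x * 2         # `x` is read through the box here
    return f()
end

# The workaround from the post: capture a fresh local instead.
function unboxed(x)
    x = x + 1
    local_x = x             # never reassigned after this point
    f = () -> local_x * 2   # concrete, unboxed capture
    return f()
end

@assert boxed(1) == 4
@assert unboxed(1) == 4
```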

8 Likes

I don’t know how true this is, the author certainly didn’t share a benchmark. Theoretically, dispatching on a tuple of multiple types is more complicated than dispatching on a single type; on top of that, AOT compilers know all of the concrete types that the program can dispatch to. But it’s actually extremely difficult to benchmark the runtime dispatch itself without getting thrown off by nearby code (like the function being called). Python is also not AOT-compiled, and its single dispatch is not a fixed vtable, but dynamically sized dictionaries.

A couple of weeks ago, I benchmarked a dynamic dispatch of +; the LLVM IR seems close enough to it anyway. It added 14.6ns to a 2ns integer addition. Function barriers are incredible for flexibility and interactivity, but we don’t want to go through them this frequently unless each call takes much, much longer.
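For readers who want to poke at this themselves, here is a Base-only sketch of isolating dynamic-dispatch overhead (timings are machine-dependent, and this is illustrative only, not the benchmark referred to above; BenchmarkTools.jl would give more reliable numbers):

```julia
struct A end
struct B end
g(::A) = 1
g(::B) = 2

# Every g(x) here is a runtime dispatch, since the element type is Any:
function dyn_sum(xs::Vector{Any})
    s = 0
    for x in xs
        s += g(x)
    end
    return s
end

# Here g resolves at compile time, no runtime dispatch:
function static_sum(xs::Vector{A})
    s = 0
    for x in xs
        s += g(x)
    end
    return s
end

xs_dyn  = Any[isodd(i) ? A() : B() for i in 1:1_000]
xs_stat = [A() for _ in 1:1_000]

dyn_sum(xs_dyn); static_sum(xs_stat)      # warm up (trigger compilation)
t_dyn  = @elapsed dyn_sum(xs_dyn)         # dominated by dispatch overhead
t_stat = @elapsed static_sum(xs_stat)     # essentially just the loop

@assert dyn_sum(xs_dyn) == 1500
@assert static_sum(xs_stat) == 1000
```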

I tried to do something similar in Python, but I wasn’t sure how to benchmark without call overheads. So I just benchmarked accessing the bound method prior to the call:

>>> class A:
...     def foo(self): pass
...
>>> a = A()
>>> import timeit
>>> timeit.timeit("a.foo", globals=globals(), number=1000000) # seconds
0.17925379995722324
>>> timeit.timeit("a", globals=globals(), number=1000000)
0.04503010003827512

So 179ns per method access, 134ns if we don’t have to look up a. Seems like a lot, so I checked a plainer dictionary access:

>>> timeit.timeit("a['foo']", setup="a={'foo':print}", number=1000000)
0.07713829993735999

So 77.1ns, which doesn’t match. Then I remembered that each method call instantiates a bound method forwarding to the actual method in the class. That can’t be optimized away at (bytecode-)compile time either, because foo can be reassigned in the class or the instance at any time.

Of course this doesn’t address how the performance scales, and it’s hard to guess without knowledge of what dispatch actually does. But the speed of even a fraction of CPython’s single dispatch seems greatly exaggerated.

4 Likes

A simple counter-example (tested on Julia 1.11.2):

julia> f1(a::Float64) = a + 1.0
f1 (generic function with 1 method)

julia> f1(a::Int) = Bool(a % 2)
f1 (generic function with 2 methods)

julia> f1(a::Vector{<:Union{Float64, Int}}) = f1.(a)
f1 (generic function with 3 methods)

julia> f2(v::Vector{<:Union{Float64, Int, Vector{<:Union{Int, Float64}}}}) = (v .|> f1 .|> first) .+ 1.0
f2 (generic function with 1 method)

julia> v = Union{Float64, Int, Vector{<:Union{Int, Float64}}}[1, 1.0, [1.0, 1], [1, 1], Union{Int, Float64}[1.0, 1]]
5-element Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}:
 1
 1.0
  [1.0, 1.0]
  [1, 1]
  Union{Float64, Int64}[1.0, 1]

julia> f2(v)
5-element Vector{Float64}:
 2.0
 3.0
 3.0
 2.0
 3.0

julia> Base.return_types(f2, (typeof(v),))
1-element Vector{Any}:
 Any

julia> @code_warntype f2(v)
MethodInstance for f2(::Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}})
  from f2(v::Vector{<:Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}) @ Main REPL[4]:1
Arguments
  #self#::Core.Const(Main.f2)
  v::Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}
Body::AbstractVector
1 ─ %1 = Main.:+::Core.Const(+)
│   %2 = Main.:|>::Core.Const(|>)
│   %3 = Main.:|>::Core.Const(|>)
│   %4 = Base.broadcasted(%3, v, Main.f1)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}, Base.RefValue{typeof(f1)}}}
│   %5 = Main.first::Core.Const(first)
│   %6 = Base.broadcasted(%2, %4, %5)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}, Base.RefValue{typeof(f1)}}}, Base.RefValue{typeof(first)}}}
│   %7 = Base.broadcasted(%1, %6, 1.0)::Core.PartialStruct(Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}, Base.RefValue{typeof(f1)}}}, Base.RefValue{typeof(first)}}}, Float64}}, Any[Core.Const(Base.Broadcast.DefaultArrayStyle{1}()), Core.Const(+), Core.PartialStruct(Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}, Base.RefValue{typeof(f1)}}}, Base.RefValue{typeof(first)}}}, Float64}, Any[Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1}, Nothing, typeof(|>), Tuple{Vector{Union{Float64, Int64, Vector{<:Union{Float64, Int64}}}}, Base.RefValue{typeof(f1)}}}, Base.RefValue{typeof(first)}}}, Core.Const(1.0)]), Nothing])
│   %8 = Base.materialize(%7)::AbstractVector
└──      return %8

I can hardly agree with you when the union-splitting technique only supports up to four (or three, in some cases) distinct types, compared to the unlimited possible combinations of function argument types that can affect the return types. More ironically, the counter-example I showed under point 1 only covered a combination of two primitive types, Float64 and Int, and one concrete composite type, Vector (along with the uninstantiable bottom type, Union{}). Yet the compiler of the latest Julia, 1.11.2, still failed.

Please don’t generalize your own experience into the global development experience of Julia, especially if you primarily apply a single coding paradigm to your domain-specific applications. Even within quantum physics or quantum information science, there are different goals and needs for code development. Also, IMHO, “4 years daily with Julia” is not that long an experience with Julia, considering that 2020 was long after the release of Julia 1.0.

If you are happy with how Julia is currently and have no problems using it in your daily work, I’m happy for you. I’m not claiming Julia’s type-complexity issue is a deadly sin that negates the entire value of the language. But given that we have repeatedly seen similar posts complaining about related problems from authors genuinely trying to use Julia for real-world applications, industrial or academic, I don’t think it’s beneficial to always respond with “It’s not that big of a deal in my personal experience.”

5 Likes

For completeness, you reminded me that we had a previous thread about this. The best apples-to-apples benchmark I could come up with showed dynamic dispatch perf being pretty comparable: Does Julia Create a "1.5" Language Problem? - #115 by ToucheSir.

In practice, there is a lot that can shift this balance:

  1. The number of dispatch targets
  2. The complexity of the signatures being dispatched to
  3. The pattern of dispatch (i.e. are you getting a high cache hit rate)
  4. The complexity and nature of the argument and return types (i.e. how much overhead is caused by boxing)

I agree the picture does not look as bleak as the article suggests, but I would not take the other extreme and say dynamic dispatch is super optimized in Julia either. After finishing the comparison in the original thread, I also remember feeling like Python internals are too opaque to make good perf comparisons and it might be more productive to compare against languages with more explicit virtual method mechanisms (C++, Rust, C#, etc) instead.

But this goes back to the point about community priorities. Whether unintentional or caused by design choices, how much do we care about dynamic dispatch? I suspect the lack of interest comes from our focus on numerical and analysis code which has to deal with it less often, but it does still come up in places like the JuliaData ecosystem. I of course also have my personal opinion on whether we should be giving this more weight, but ultimately the decision is a communal one driven by the people working on the language and the ecosystem.

2 Likes