Blog post: Julia's latency: Past, present and future

Dear all

I wrote a new blog post on Julia - this time on the history and future of Julia’s latency. Comments are welcome!
https://viralinstruction.com/posts/latency

48 Likes

Pre v1.0, focus was obviously on getting the semantics and API of the language right

One interesting thing that would be fun to reference here is that getting the semantics of the language right is one of the things that made latency worse. Issue #265 was perhaps the most infamous of Julia’s issue numbers, but its fix in v0.6 is precisely what introduced invalidations. But the connection between the #265 fix (a huge boon!) and latency (a huge boondoggle!) was lost on me — and I suspect many others — until Tim started on his latency hunts.

12 Likes

2: Avoid defining overly-broad methods

Regarding this section: doesn’t it go against the common and recommended approach of only putting type constraints on methods when they are (potentially) needed for dispatch?

Not adding type annotations is fine. The issue is adding new methods that are semantically too broad, as in these examples:

  1. Adding constructors to whole groups of types you don’t know anything about.
  2. Adding a convert method for types whose semantics you don’t know.
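A minimal sketch of the second case (the types MyAbstract and MyConcrete are made up for illustration): a convert method written over a whole abstract type claims behavior for subtypes the author knows nothing about, including ones other packages may define later. Constraining the method to a type you own keeps its semantics sound:

```julia
abstract type MyAbstract end

struct MyConcrete <: MyAbstract
    x::Int
end

# Too broad: claims to convert anything to *any* subtype of MyAbstract,
# including subtypes other packages may define later.
# Base.convert(::Type{T}, x) where {T<:MyAbstract} = T(x)

# Narrower: only defined for a type you own, with known semantics.
Base.convert(::Type{MyConcrete}, x::Integer) = MyConcrete(x)
```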
2 Likes

Since Julia v1.0 (and before), Julia already cached compiled code in two places: Code compiled during a session was cached in memory, and code compiled during package installation was cached to disk in a process known as precompilation

I thought that precompilation was only caching parsed code (or parsed + one extra step) but not actually fully-compiled code.

EDIT: read the whole post and saw this was mentioned at the end…

Prior to 1.9, precompilation cached typed IR (the result of running type inference and Julia-level optimizations). In 1.9, it also caches native code.

2 Likes

I just checked the compiled folder and indeed, on 1.9 there are now both ji and dll files. The ji files are much smaller than on 1.8, but in total the size increases by ~50% (for OrdinaryDiffEq at least).

It would be interesting to get insight into how this works & what is stored, maybe a blog post?

1 Like

Jeff’s 1.9 webinar talks a bit about this.

Oh man… New Jakob Nissen post just dropped. Thanks for sharing and eager to read!!! :smile:

3 Likes

Great post! You researched everything very carefully.

I have only one substantive correction: in Julia's latency: Past, present and future, using Cthulhu’s @descend is now (IMO) significantly easier than @code_warntype; don’t forget that reading type-inferred code is basically reading a foreign language, and the new Cthulhu avoids the need by presenting results woven into the source code as written by the programmer.

Another small point: the reasons you cite for our interpreter being slow are all true, but there’s another one: because so much of Julia is written in Julia, there is much more to interpret. It’s very easy for Python to interpret call_c_because_I_do_all_real_work_in_C(args...).

19 Likes

A mode where your code is interpreted but uses compiled package code would be very interesting.

10 Likes

Great post, thanks! These posts look more and more like advanced complements to the language manual :slight_smile: and a good starting point for a book.

2 Likes

Really well written! Having only a little understanding of latency and invalidations this was really interesting, and also taught me a lot. Well done

1 Like

You can fake this with push!(JuliaInterpreter.compiled_modules, Base, OtherModulesYouWantToRunCompiled...). Perhaps we should add a JuliaInterpreter.compiled_modules_except(mods...).
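A sketch of what such a helper might look like (hypothetical — compiled_modules_except does not exist in JuliaInterpreter; here a plain Set{Module} stands in for JuliaInterpreter.compiled_modules):

```julia
# Hypothetical sketch: mark every currently loaded module as "compiled"
# except the ones passed in, so only those modules would be interpreted.
function compiled_modules_except!(compiled::Set{Module}, mods::Module...)
    excluded = Set(mods)
    for mod in Base.loaded_modules_array()
        mod in excluded || push!(compiled, mod)
    end
    return compiled
end

# Usage: run everything compiled except Main, so only REPL code is interpreted.
compiled = compiled_modules_except!(Set{Module}(), Main)
```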

But riffing off your point, we could choose interpret vs compile based on what is already compiled. :thinking:

2 Likes

Indeed, the post is a great overview of the work done on latency. I don’t know any of the technical details, but it gives an idea of how challenging it was to reach those great achievements.

Sorry if I go slightly off topic, but since @descend was brought up:

Are there plans to replace the output of @code_warntype in Base with what @descend gives? That would be so great for beginners.

Not in the near future; it’s a pretty huge dependency stack.

It will become much easier when the ponies are delivered to my doorstep: aka, when JuliaSyntax is Julia’s parser, and when lowering has been rewritten to track location info in the same way JuliaSyntax does for parsing, then nearly everything needed will already be in Julia proper. So at that point, I think that’s something we should just do.

But in the short term, we should stop recommending @code_warntype and just recommend Cthulhu.

6 Likes

Thanks for the great explanation!

Thanks for the comments, I agree and have amended the post as suggested.

Also, the new Cthulhu is great as an advanced replacement of @code_warntype! Nice work.

3 Likes

I understand that the solution to issue #265 helps with interactive development in the REPL, but where else is this useful/necessary? The Julia community coped with non-redefinable types, so isn’t this similar to #265? Could we have a latency friendly Julia fork by sacrificing world-age?

The solution to #265 (aka, adding invalidation as a language feature) is needed if you want to support the following combination of properties:

  1. interactive development
  2. “method overloading” by packages that don’t own the function
  3. aggressive compilation
  4. consistent compilation: same result no matter how you got there

To illustrate: suppose you have a function f with one method,

f(::Any) = 1

and then write

g(list) = sum(f.(list))

Now let list be a Vector{Any}. You can compile a fast g(::Vector{Any}) (aggressive compilation) by leveraging the fact that you know there’s only one possible method of f, and you know what the output is: g(list) gives you just length(list). But now suppose you add a second method (interactive development + method overloading)

f(::MyObj) = 2

where MyObj is some new type you’ve defined (so it’s not type-piracy). If you want to get the right answer (consistent compilation) from an arbitrary list::Vector{Any}, there are only two options:

a) Plan for this eventuality from the beginning by making every call to f go through runtime dispatch. But if there really is only one method of f, this is vastly slower, so it at least partly violates aggressive compilation.
b) Throw away the code for g that you created when there was only one method of f, and recompile it in this new world where there are two.

Julia does a mix of these: we do b) up to 3 methods, and then a) thereafter.
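The experiment above can be run in a fresh session (a sketch; the values follow directly from the method definitions):

```julia
f(::Any) = 1
g(list) = sum(f.(list))

list = Any[10, 20]
g(list)  # 2: g compiles in a world where f has only one method

struct MyObj end
f(::MyObj) = 2  # adding this method invalidates the compiled g

g(Any[10, MyObj()])  # 3: g is recompiled in the new world and picks up f(::MyObj)
```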

Now, I’ve framed this as an experiment at the REPL, but keep in mind this is also relevant if you load two packages: PkgA might define f and g, and PkgB might define a second method of PkgA.f. Unless you want to defer all compilation (including for Base) until the entire session is loaded and then closed to further extension, you have to make the same choice between a) and b). The entire package ecosystem would collapse if we didn’t.

I’d argue that the combination of these four properties is a lot of what makes Julia what it is. AFAIK, Julia is the only language that supports all four of these properties. But that means we’re having to blaze new ground to figure out what the costs are and how to mitigate them.

20 Likes