Finding and fixing invalidations: now, everyone can help reduce time-to-first-plot

I just posted a new blog post on recent work reducing Julia’s latency, a.k.a., “time to first plot” and even “time to second plot (after loading more code)”. The blog post describes recent work on diagnosing and eliminating invalidations, events that cause Julia to have to recompile previously-compiled code. The blog post focuses on (1) explaining the underlying ideas and (2) briefly summarizing the overall progress we’ve made so far.

I’m writing this post to emphasize to package developers the third arm of this effort: the existence of tools that can be used to help resolve similar problems in your own packages. If you’re using Julia’s master branch you may already be getting some benefit from the improvements to the compiler, Base, and the standard libraries, but you can pitch in to help make it even better. This effort comes in essentially three flavors:

  • while Julia itself has made a lot of progress, there’s still more to be done. Most of the remaining vulnerabilities seem to involve string processing, an area that hasn’t received much attention yet. Even simple packages like FilePathsBase still trigger hundreds of invalidations, and we could use one or more heroes to step forward and get this straightened out.

  • in areas where Julia is (or will become) near-bulletproof, the next frontier will be package interactions. The basic idea is that you might load one package, and then the next package you load trashes some of the compilation work of the previous one. (This happens if packages do a fair amount of “work” as they initialize, or if you use them interactively and then load more packages.) If this matters to you, there’s room here for lots of heroes, because we have lots of packages. It’s also worth noting that invalidations have proven to be a smoking gun signaling opportunities for improving code, so you may well find such efforts rewarded with better runtime performance as well as lower latency.

  • one of the byproducts of reducing invalidations is that precompile statements work better than they used to, because the work of precompilation is not being invalidated as frequently as it once was. If you’ve not done so already, the ramp-up to 1.6 might be a good time to consider adding precompile files so that your packages let users start doing real work faster. This can help reduce inference time today. Looking further ahead, it’s possible that the work on squashing invalidations will make it viable to precompile and cache “native” code, which would entirely eliminate the cost of compilation when you first start executing code.

Now to the tools themselves. The first place to start is SnoopCompile, specifically the @snoopr macro and its associated analysis code and interactive tools. It takes a little while to learn, but if you’re already at least a bit comfortable reading the output of @code_warntype you’ll pick it up quickly. (I’ve posted a video of an interactive session that might help newcomers to this topic.) Following the trail of the invalidations quickly leads you to inferrability problems, and the remarkable Cthulhu (which now integrates with SnoopCompile) lets you track the problem to its origin very quickly.
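
To make that concrete, here’s a minimal sketch of a @snoopr session (FilePathsBase is just the example mentioned above; run this in a fresh session, because the invalidations happen while the package loads):

julia> using SnoopCompile

julia> invalidations = @snoopr using FilePathsBase;   # you can also snoop a second package loaded after a first

julia> trees = invalidation_trees(invalidations)      # group the invalidations by the method definition that caused them

From there, the Cthulhu integration lets you ascend through the chain of invalidated callers and inspect where inference lost track of the types.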

A different tool in SnoopCompile, @snoopi, is useful for measuring how much time is spent on inference, and also for generating precompile statements that let your package start doing real work sooner. There is even a bot that automates the process of precompile-file maintenance.
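
For example, a session might look something like this (MyPkg and some_workload are placeholders for your own package and a representative workload):

julia> using SnoopCompile

julia> using MyPkg

julia> inf_timing = @snoopi tmin=0.01 MyPkg.some_workload();   # (time, MethodInstance) pairs taking ≥ 0.01s to infer

julia> pc = SnoopCompile.parcel(inf_timing);                   # organize the statements by owning package

julia> SnoopCompile.write("/tmp/precompile", pc)               # writes precompile_MyPkg.jl, ready to copy into src/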

A more first-principles approach is MethodAnalysis, which can survey and answer questions about the “output” of Julia’s compiler, the various compiled methods. MethodAnalysis facilitates broad analysis (a “bird’s-eye view” of your compiled code) as well as hunting down specific triggers of invalidation that are hard to discover by other means. I don’t recommend starting here, just because it’s more of a Swiss Army knife than a focused tool, but it has been quite useful so far.
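
For a taste, here’s a small sketch (illustrative, not cut-and-paste from a real session) using its two main entry points, methodinstances and visit, to count the MethodInstances attributable to Base:

julia> using MethodAnalysis

julia> mis = methodinstances(sum);    # every compiled specialization of sum

julia> n = Ref(0);

julia> visit(Base) do item            # walks modules → functions → methods → MethodInstances
           item isa Core.MethodInstance && (n[] += 1)
           true                       # returning true descends into the item's children
       end

julia> n[]                            # total count of compiled MethodInstances in Base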

If folks want to take a stab at this effort but find they need help, I’m happy to offer a bit of coaching as time permits, either here or on Zulip or Slack.


This is very good work, thanks for tackling it. I saw in the table that Makie went from 3273 to 129 invalidations between Julia 1.5 and the master branch. That looks like a massive improvement without any work on our side, but can I gauge from that number alone approximately how much time could be saved? I don’t know whether our compilation latency has much to do with the invalidation mechanism or whether we just have too many imports / too much code. As a side note, if Plots becomes too fast due to all these improvements, we could make Makie the next default target :wink:


Julia 1.5:

julia> @time using Makie
 12.187105 seconds (23.72 M allocations: 1.306 GiB, 3.79% gc time)

julia> @time display(plot(rand(5)))
 34.808714 seconds (77.81 M allocations: 3.968 GiB, 3.49% gc time)
GLMakie.Screen(...)

master branch:

julia> @time using Makie
  5.169340 seconds (12.24 M allocations: 839.493 MiB, 7.09% gc time)

julia> @time display(plot(rand(5)))
Internal error: encountered unexpected error in runtime:
[...]
 30.541731 seconds (72.74 M allocations: 4.232 GiB, 4.09% gc time)
GLMakie.Screen(...)

The “runtime” bug is here, and probably unrelated to the invalidation work (though you never know for sure until it’s fixed). So about a 2x speedup in loading time, some of which might be due to the invalidation work (note especially that the memory consumption is halved) but could also be due to some amazing work Jameson did in changing the way method tables are sorted.

For Makie, beyond the welcome halving of loading time, the real opportunity is precompilation. I know that @SimonDanisch tried precompile directives generated by SnoopCompile and was disappointed by the results, but now I have a much better understanding of why: if Makie invalidates components of Base or the stdlibs that it relies on (and from the massive number of invalidations it causes on 1.5, it likely did), the precompilation would be expected to be useless, because Julia can’t use the results. You can’t use inference results for a method m2 that depends on m1 if m1 was invalidated just by loading Makie. That’s much less likely to happen now. So I might recommend following the example of the Plots developers in their approach to precompilation, and then telling us all how it works out!

Edit: baseline instructions on precompile files are here, and the bot provides a mechanism for automatically updating your precompiles as your code changes.
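
For the curious, the generated output typically looks something like the sketch below (the specific signatures here are purely illustrative):

# precompile.jl, include()d at the end of the enclosing module definition
function _precompile_()
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing   # run only while precompiling
    precompile(Tuple{typeof(plot), Vector{Float64}})                # illustrative signatures
    precompile(Tuple{typeof(plot), Matrix{Float64}})
end
_precompile_()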


Wow! I don’t have anything useful to add; I just wanted to express my gratitude and let you know that the sheer amount of knowledge you have of Julia’s internals is simply stunning.
All the hard work you and others have put into tackling these difficult problems is definitely appreciated.


Nice post. It did leave me with a question, though: do methods get recompiled as soon as they are invalidated? If so, would it be possible and fruitful to save time by delaying recompilation until the invalidated method is actually called?


It already works that way: invalidation happens at the time of method definition, and compilation happens when you need it (because you’re calling it).
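
A tiny demonstration of the timing:

julia> f(x) = 1;

julia> g() = f(1);

julia> g()                # compiles g, whose code depends on the result of f(::Any)
1

julia> f(x::Int) = 2;     # defining this invalidates g's compiled code, but nothing recompiles yet

julia> g()                # recompilation happens here, at the call
2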

There are circumstances where the recompilation happens almost immediately. Invalidations in the code-loading pathway are a particularly bad example. Julia comes shipped with the loading code precompiled, to make your first load much snappier. But suppose you say using PkgC, which depends on PkgB, which in turn depends on PkgA; in principle the following could happen (and sometimes does, particularly prior to the recent work on the master branch):

  1. load PkgA, which happens to invalidate the loading code
  2. attempt to load PkgB, which forces you to recompile the loading code
  3. now load PkgB, which unfortunately also invalidates the loading code
  4. attempt to load PkgC, which forces you to recompile the loading code…

In the worst case, loading n packages forces you to recompile the loading code n times.

Some have proposed “walling off” the loading code and making it impossible to invalidate, and there is a lot of merit to that. But IMO that won’t be necessary because we’re pretty close to having the loading code “bullet-proof” now; I think that there are only a dozen or so “bad” MethodInstances that it now depends on, almost all in the area of string processing. If someone tackles that area then I really don’t think we’ll have to worry about that happening anymore.


Is it possible to pin this post so it’s visible to all visitors on Julia Discourse?


Where can I read about what exactly happens during package-related operations: install, build, precompile, using, and a function’s first and second calls?

Do I understand correctly that package devs are encouraged to work more on their packages’ precompilation (which happens at the build step), so that many runtime problems would be solved, at least for standard Base types?

Can that package precompilation happen in the “background”? E.g., when I first use plot(::Vector{Int}), the compiled code would be saved in some package-associated image, and the next time I start Julia, load Plots, and call that function, there would be no compilation, even if the package does not include precompile statements?


In the blog post, the example shows invalidations happening for a method called with an abstract type, as would be the case if there had been a type instability.

Are there other possible reasons for invalidation, or is there just a lot of type-unstable code out there?

Also, an earlier effort to reduce latency was to skip inference on various functions to avoid specializing when it wasn’t necessary (with @nospecialize). Wouldn’t this make the invalidation problem worse by making more methods susceptible to being invalidated?

Are there other possible reasons for invalidation, or is there just a lot of type-unstable code out there?

Piracy can cause invalidation, and indeed checking invalidations seems to be one of the better ways of detecting piracy you may not have known about. But in my experience most invalidations are triggered by poor inference. Sometimes that’s not surprising: there are good reasons Pkg uses Dict{String,Any} for a lot of stuff, but it turns out that if you don’t want Pkg’s code to be invalidated, you later have to annotate some objects whose types you happen to know.
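
Here’s a hypothetical sketch of the kind of annotation I mean (the names are made up, but the flavor matches Pkg’s internals):

# `info` is a Dict{String,Any}, so lookups infer as Any
function pkgname(info::Dict{String,Any})
    name = info["name"]::String   # we happen to know it's a String; the assert restores concrete inference
    return uppercase(name)        # this call is now fully inferrable
end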

Also, an earlier effort to reduce latency was to skip inference on various functions to avoid specializing when it wasn’t necessary (with @nospecialize). Wouldn’t this make the invalidation problem worse by making more methods susceptible to being invalidated?

Exactly right. Adding @nospecialize is often an extremely good idea for reducing latency, but what I’ve learned more recently is that you sometimes want to follow up with “safe” type annotations in the places where it matters. That is, if you’re calling somemethod(i::Integer), and you’ve added @nospecialize annotations that prevent inference from knowing the exact type of i, but you happen to know it’s really an i::Int, you might want to write that call as somemethod(i::Int) (not in the signature of the method, but in the body of the caller).
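
In code, the pattern looks something like this (all names hypothetical):

somemethod(i::Integer) = i + 1

function caller(@nospecialize(obj))
    i = obj.index                 # inferred as Any, because obj was @nospecialize'd
    return somemethod(i::Int)     # call-site assertion: inference now knows exactly which method is called
end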
