Startup time of 1000 packages – 53% slower in Julia 1.12 vs 1.10

I think startup time is a big deal for Julia, especially for new users. If it takes a long time to load a package, it makes Julia look bad.

Recently, Julia 1.12 beta was released, and I noticed that load times are higher than before. When I talked to other package developers, this seemed like a shared experience, so I wrote a script to measure the install, precompile and load times of the 1000 most downloaded packages.
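As a rough illustration (a simplified sketch, not the actual script, which is linked at the bottom), the measurement of each phase looks something like this:

# Sketch of a per-package timing harness. Each phase runs in a subprocess with
# a fresh depot, so nothing is cached between packages. Timings include Julia
# startup overhead; the real script is more careful, this just shows the shape.
function time_phases(pkg::String)
    depot, proj = mktempdir(), mktempdir()
    run_timed(code, env...) = @elapsed run(addenv(
        `julia --startup-file=no --project=$proj -e $code`,
        "JULIA_DEPOT_PATH" => depot, env...))
    install = run_timed("using Pkg; Pkg.add(\"$pkg\")",
                        "JULIA_PKG_PRECOMPILE_AUTO" => "0")  # download + install only
    precomp = run_timed("using Pkg; Pkg.precompile()")       # (parallel) precompilation
    load    = run_timed("import $pkg")                       # load from the fresh cache
    return (; install, precomp, load)
end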

Dashboard

To visualize the results, I wrote a Pluto notebook (obviously :sweat_smile:); you can read it online here and play with the parameters yourself.

Here is a screenshot of the notebook, click to go to the interactive version:

screenshot of the notebook

No complete TTFX measurement

Unfortunately, these numbers do not tell the complete story, since they do not measure JIT compilation. That happens when you use a package for the first time, e.g. the first time you call Plots.plot(data). To measure this, we would need a “representative workflow” for every package.
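For a single package, such a measurement could look like this (a hypothetical sketch using Plots as the example; the numbers above do not include this):

# Hypothetical TTFX measurement for one package: time the first call to a
# representative function in a fresh session. The first plot() call includes
# JIT compilation of the whole plotting pipeline.
using Plots
data = rand(10)
ttfx = @elapsed plot(data)
@show ttfx  # subsequent calls would be orders of magnitude faster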

My take

In my view, package latency has gone up significantly in the last two Julia versions. The average time to install, precompile and load an environment with a popular package is:

  • 37% slower in Julia 1.11 compared to Julia 1.10
  • 53% slower in Julia 1.12 compared to Julia 1.10

This is the time it takes to install, (parallel) precompile and import a package, plus its dependencies, on a fresh installation of Julia. Average taken over the 1000 most downloaded packages.

My analysis takes installation and precompilation (let’s call this ‘setup’) into account, which not all TTFX discussions do. I believe that setup time is very important for new users, who might face setup times more often (trying new packages), and who might find it most surprising (coming from another ecosystem like Python or Matlab).

Your take

I am curious to hear what you think of the results! Be sure to play with the parameters in the notebook, edit the code, or run your own benchmarks.

Source code & data

The script to measure the package loading times is available here:

https://github.com/fonsp/package-loading-times

You can find the measurements as .json files here:

https://github.com/fonsp/package-loading-times/tree/results-v1

You can also get code to load the data from my notebook; see the dashboard for the source.

45 Likes

Well, 1.12 is not yet released, so I expect some improvements to come before the final release.

Thank you @fonsp for sharing these concerns. I am also worried about these trends, as I am constantly training beginners in Julia, and they spend a lot of time installing packages for the first time.

7 Likes

A big improvement would be if pre-compiled packages could be downloaded and installed without local pre-compilation. I think this will happen at some point in time.

12 Likes

Just a meta-comment: I think maybe the title and first paragraph should be edited to make it clearer early on that you’re including the download, installation, and precompile times in your measurements.

This of course makes sense to measure for Pluto.jl usage, where one ends up hitting that pathway very often when making new notebooks, but it’s not what most Julia users would typically think of when they hear someone talking about startup times (I at least thought this was about Time-To-Load an already precompiled package until I got to the end of your post).


Personally, I’m less concerned about the precompile times (so long as they’re used to reduce TTFX) than I am about the load times, which in this measurement are totally swamped by the precompile times (orders of magnitude greater). But unfortunately, load times have also been steadily rising since v1.10.


On topic:

I think that the answer to getting precompile times down is going to have to be breaking compilation units into smaller pieces, so you don’t have to precompile stuff you don’t actually need (and can take greater advantage of parallelism). There’s a good GitHub issue about this here: Monorepo Packages via Submodules/Subpackages: Support in Pkg and Julia Compiler · Issue #55516 · JuliaLang/julia · GitHub


Edit: somehow I missed that Fons did actually say his script measures those things at the top of the post. Sorry for the noise.

4 Likes

Hm, I have to admit I do not share this sentiment. First of all, my impression is that downloads are practically instant; I’ve never noticed any meaningful amount of time spent downloading. To install, perhaps? It is not clear to me what “install” even means in the context of Julia. I thought once the source code is downloaded, the “installation” is the precompilation…?

In any case, in my typical use cases of Julia I am constantly switching between different projects with different dependencies and different versions. This triggers precompilation surprisingly frequently, for reasons I never fully understood (so far I thought that once a package is precompiled it stays that way forever, even if you later precompile a newer version, but I guess that’s not the case?). Anyway, I am saying this just to provide a data point from someone who is definitely constantly affected by precompile times. I also notice them to be much longer than TTFX.


EDIT: Thanks @fonsp for setting this code up!!!

8 Likes

Just to be clear, I’m not saying precompilation times are not important. They absolutely are.

I’m just saying that most people might not read “Startup time” and think “ah of course that includes the download and precompile times”, so the relevant definition should maybe be in the problem statement, or at least accompanying the graph, not down at the bottom.


Edit: somehow I missed that he did actually say his script measures those things in the problem statement. Ignore me, sorry for the noise.

1 Like

Same, it seems to be very unpredictable sometimes.
However, one reason could be that Julia only stores 10 different precompilation files for a package by default. In my case, I set export JULIA_MAX_NUM_PRECOMPILE_FILES=50 to have more files stored.
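If you are curious how close you are to that limit, here is a small sketch (mine, assuming the standard depot layout) to count cache files per package:

# Sketch: count precompile cache files per package in the active depot
# (standard layout assumed). Once a package hits the limit, older caches
# are evicted and precompilation is triggered again.
compiled = joinpath(DEPOT_PATH[1], "compiled", "v$(VERSION.major).$(VERSION.minor)")
for dir in filter(isdir, readdir(compiled; join=true))
    n = count(endswith(".ji"), readdir(dir))
    n > 1 && println(basename(dir), " => ", n, " cache files")
end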

Also, shameless plug: I assembled PrecompileAfterUpdate.jl, which avoids the unexpected precompilation after Julia updates. It crawls through your recent manifests and precompiles them whenever you want (for example, after a Julia update).

11 Likes

This triggers precompilation surprisingly frequently for reasons I never fully understood

Maybe it is just unlikely that all transitive dependencies are the same across projects that have the same top-level dependencies? If some low-level dependency then has a different version, all downstream packages need to be recompiled (I believe). This plus the limit of 10 versions already mentioned.
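One way to check this hypothesis would be to diff the resolved versions between two manifests, something like this sketch (manifest format 2.0 assumed; the paths are hypothetical):

# Sketch: compare resolved dependency versions across two project manifests
# (manifest_format 2.0 assumed; stdlibs have no "version" entry).
using TOML
versions(path) = Dict(name => get(first(entries), "version", "stdlib")
                      for (name, entries) in TOML.parsefile(path)["deps"])
a, b = versions("ProjectA/Manifest.toml"), versions("ProjectB/Manifest.toml")
for name in sort!(collect(intersect(keys(a), keys(b))))
    a[name] == b[name] || println(name, ": ", a[name], " vs ", b[name])
end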

Is there any way we can cache and distribute the results of precompilation? I know that some packages invalidate methods in other packages, but is there a way this can work in the common or restricted case?

I think it would help as much as the precompilation directives in individual packages help now, which is a lot.

Yeah. I personally don’t use 1.11 or 1.12 even though I have them installed. 1.10 just feels much better.

4 Likes

Me too. And this is why I think pushing juliaup, which just installs the latest version, is so annoying. I know we can install other versions, but first-time users do not know that, and they will be led into a worse experience than it has to be.

1 Like

Different environments can result in different versions of dependencies for a given package and version (AFAIK dependency resolution in practice also isn’t deterministic for a given project, and Pkg isn’t an exception), and that forces precompilation of the package itself as a result (say a method inlines a dependency’s method that varies across versions). If you’re just activate-ing active environments (environments with Pkg-tracked manifests) without update-ing or add-ing anything, though, I would expect the behavior you expected, within the cache file limit per package. It might also help to make environments inactive by deleting tracked manifests, or better yet, untracking them with PkgCleanup.jl.

I have seen code with many small allocations run 5x faster on julia 1.11 compared to 1.10, so the compiler is doing more work at optimizing some cases.

4 Likes

Is there also a way to quantify and control for the size of the code base (lines, methods, precompiled methods)? While longer load times could obviously be a good (and routine for AOT libraries) tradeoff for reduced JIT compilation in normal usage (which you pointed out is subjective), it could also just be an increase in features across versions. If those packages are in fact the same versions across Julia versions, then this question would be moot, but they’re probably not.
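As a crude starting point for such a metric, one could count the methods a package defines after loading it; a sketch (the package name is just a placeholder):

# Sketch: one crude size metric, counting the methods a package defines in its
# own module after loading it (misses lines of code and precompiled methods).
function ownmethods(mod::Module)
    n = 0
    for name in names(mod; all=true)
        isdefined(mod, name) || continue
        f = getfield(mod, name)
        f isa Function || continue
        n += count(m -> m.module === mod, methods(f))
    end
    return n
end

using Example  # hypothetical package
println("methods defined by Example: ", ownmethods(Example))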

Seems like the way forward, though I’d actually prefer being able to write the shorter using OrdinaryDiffEqCore, OrdinaryDiffEqTsit5 instead of using OrdinaryDiffEq: OrdinaryDiffEqCore, OrdinaryDiffEqTsit5, with the clearer implication that I only installed and need to load those 2. Not sure, maybe the wider OrdinaryDiffEq could just be another, bigger dependent that happens to live in the same monorepo as the dependencies and share their releases.

1 Like

I think it’s worth noting that multiple issues are tracking this across a number of concrete cases, and the compiler team is very well aware of the problem. My own uninformed spidey sense is that there are two major effects here:

  • 1.10 was the culmination of a major effort to reduce TTFP and it was hyper optimized to avoid invalidations and the like. 1.11 and 1.12 haven’t seen as much love on that front yet.
  • More packages started using PrecompileTools, pushing yet more into package precompilation.

7 Likes

Just as an anecdotal extra: I’ve got a few packages I work on that I am very focused on trying to make as quick to load and run as possible. Across these, I’ve seen and battled with large regressions on 1.11 and 1.12. Looking at --trace-compile, it seems like this is in some part caused by a large increase in invalidations, some of which occur in really hard-to-understand ways.

For example, BaseDirs.jl is 100% concretely typed with no dependencies. In 1.9 and/or 1.10 (I forget which) loading and using it in another package is near-instant with no invalidations. However, in 1.11/1.12 I see surprising invalidations (only when loaded by another package) that despite extensive trial and error I have been unable to understand or resolve.
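For reference, the usual way to inspect these is something like the following SnoopCompile workflow (a sketch, assuming a recent SnoopCompile where the macro is called @snoop_invalidations):

# Sketch: record invalidations caused by loading a package, then group them
# by the offending method (SnoopCompile's standard invalidation workflow).
using SnoopCompileCore
invs = @snoop_invalidations using BaseDirs
using SnoopCompile  # load only after recording, so it doesn't skew the data
trees = invalidation_trees(invs)  # the worst offenders are listed last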

I’ve also come across other strange invalidations/(re)compilation appearing in other places. For instance, using Downloads takes <1ms in 1.10, but ~40ms in 1.11. Running @time_imports using Downloads shows 130ms to load NetworkOptions, 97% of it compilation time. On a related note, adding up all the times from @time_imports will often give a value much larger than @time using. I’ve taken to using a difference of hyperfine measurements to actually get load-time results I can trust.
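Roughly, that workflow looks like this (illustrative commands, using Downloads as the example):

# Sketch of the "difference of hyperfine measurements" idea: benchmark an
# empty session and one that loads the package, then subtract the mean times
# to isolate the load time from session startup noise.
baseline = "julia --startup-file=no -e ''"
withpkg  = "julia --startup-file=no -e 'using Downloads'"
run(`hyperfine --warmup 3 $baseline $withpkg`)
# load time ≈ mean(withpkg) - mean(baseline)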

I suspect the large increases that Fons sees here are connected to these sorts of issues I’ve been encountering, as well as the increased precompilation that Matt notes (which surely compounds the harm of invalidation, when code from dependencies is needlessly recompiled in precompilation tasks).

13 Likes

Would you be willing to expand a bit more on this, out of curiosity? What is your method here when you are attempting to get load-time results – do you have a mini “load-time” code snippet to show this workflow? Thanks!

Oh man, I never noticed, this is so weird. NetworkOptions spending so much time compiling isn’t too surprising given its latest release predates native code caching, and package loading varies across sessions, especially if some dependencies were already loaded, but it really doesn’t add up (first lines in different REPL sessions):

julia> @time using Downloads
  0.074386 seconds (66.94 k allocations: 3.905 MiB)
julia> @time using NetworkOptions
  0.009690 seconds (6.05 k allocations: 391.415 KiB)
julia> @time_imports using Downloads # really wish it printed a total
               ┌ 0.0 ms NetworkOptions.__init__()
    161.1 ms  NetworkOptions 97.77% compilation time
      9.2 ms  ArgTools
               ┌ 0.3 ms nghttp2_jll.__init__()
      3.6 ms  nghttp2_jll
               ┌ 1.7 ms LibCURL_jll.__init__()
      4.1 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__()
      3.9 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__()
      1.9 ms  LibCURL
               ┌ 0.4 ms Downloads.Curl.__init__()
     21.4 ms  Downloads

julia> (161.1+9.2+3.6+4.1+3.9+1.9+21.4)/1000 # total in seconds
0.2052