Startup time of 1000 packages – 53% slower in Julia 1.12 vs 1.10

Sure, I’ve just been doing things like:

~$ hyperfine 'julia +1.11 --startup-file=no -e "using Base"' 'julia +1.11 --startup-file=no -e "using Downloads"'
Benchmark 1: julia +1.11 --startup-file=no -e "using Base"
  Time (mean ± σ):     130.6 ms ±   5.8 ms    [User: 143.8 ms, System: 119.6 ms]
  Range (min … max):   124.6 ms … 148.0 ms    21 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     198.2 ms ±   5.0 ms    [User: 2270.8 ms, System: 142.3 ms]
  Range (min … max):   191.9 ms … 206.5 ms    14 runs
 
Summary
  julia +1.11 --startup-file=no -e "using Base" ran
    1.52 ± 0.08 times faster than julia +1.11 --startup-file=no -e "using Downloads"

Which I take to indicate that using Downloads takes ~70 ms in practice (198.2 ms − 130.6 ms), as opposed to the ~40 ms indicated by @time or the ~200 ms indicated by @time_imports.
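
For reference, this is roughly how the two in-process numbers can be obtained, each in a fresh process (a sketch; @time_imports lives in InteractiveUtils, so it needs to be loaded first when running with -e):

julia +1.11 --startup-file=no -e '@time using Downloads'
julia +1.11 --startup-file=no -e 'using InteractiveUtils; @time_imports using Downloads'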

5 Likes

NetworkOptions was removed from the sysimage, so I think this is expected: High package load time/ allocations on Julia 1.11 · Issue #40 · JuliaLang/NetworkOptions.jl · GitHub.

4 Likes
ufechner@framework:~$ hyperfine 'julia +1.10 --startup-file=no -e "using Downloads"' 'julia +1.11 --startup-file=no -e "using Downloads"'
Benchmark 1: julia +1.10 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     102.2 ms ±   1.5 ms    [User: 52.6 ms, System: 59.1 ms]
  Range (min … max):   100.1 ms … 105.0 ms    28 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     114.8 ms ±   2.5 ms    [User: 290.9 ms, System: 49.2 ms]
  Range (min … max):   111.1 ms … 121.6 ms    26 runs
 
Summary
  julia +1.10 --startup-file=no -e "using Downloads" ran
    1.12 ± 0.03 times faster than julia +1.11 --startup-file=no -e "using Downloads"

And I get only a 12% difference between Julia 1.11 and 1.10.

That’s because most of the time in that benchmark comes from starting Julia itself.
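
One way to check that (a sketch, not measured here) is to add bare-startup baselines to the same comparison and subtract them out:

hyperfine 'julia +1.10 --startup-file=no -e ""' 'julia +1.11 --startup-file=no -e ""' 'julia +1.10 --startup-file=no -e "using Downloads"' 'julia +1.11 --startup-file=no -e "using Downloads"'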

2 Likes

I don’t think that is the reason. Even if I load a very large package, where the startup time of Julia plays an insignificant role, I get similar results:

Benchmark 1: julia +1.10 --startup-file=no -e "using KiteModels"
  Time (mean ± σ):      5.530 s ±  0.015 s    [User: 5.168 s, System: 0.928 s]
  Range (min … max):    5.507 s …  5.549 s    10 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using KiteModels"
  Time (mean ± σ):      6.256 s ±  0.015 s    [User: 6.307 s, System: 0.515 s]
  Range (min … max):    6.222 s …  6.273 s    10 runs
 
Summary
  julia +1.10 --startup-file=no -e "using KiteModels" ran
    1.13 ± 0.00 times faster than julia +1.11 --startup-file=no -e "using KiteModels"

In this example, Julia 1.10 is 13% faster than Julia 1.11. But this is only the package load time; it does not include the pre-compilation time. For me, the load time is much more relevant than the pre-compilation time.

1 Like

Another data point. Time-to-first-gradient with Mooncake + DifferentiationInterface is 40% higher in Julia 1.11 vs 1.10:

> hyperfine --warmup 3 'julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"' 'julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"'
Benchmark 1: julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"
  Time (mean ± σ):     43.036 s ±  0.159 s    [User: 42.316 s, System: 0.651 s]
  Range (min … max):   42.892 s … 43.365 s    10 runs
 
Benchmark 2: julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"
  Time (mean ± σ):     60.154 s ±  0.400 s    [User: 59.122 s, System: 0.876 s]
  Range (min … max):   59.707 s … 61.109 s    10 runs
 
Summary
  julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])" ran
    1.40 ± 0.01 times faster than julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"

Here julia runs Julia 1.11.5. For some reason, julia +1.11 says “ERROR: 1.11 is not installed. Please run juliaup add 1.11 to install channel or version.” I opened an issue about this.

As a side note, computing the first gradient of a basic quadratic function of three variables (3D vector) takes one minute??

1 Like

You have 1.11.5 installed in the release channel (arguably that could be improved - maybe open an issue in juliaup).

I find the regression in TTFP much worse than the load/compile time (at least in this case): 1.11 is ~2.5 times slower and 1.12 is ~4 times slower than 1.10. And that example is for the exact same command, which has been pre-compiled with PrecompileTools.
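
For anyone who wants to reproduce a TTFP-style comparison, the same hyperfine pattern from above works; this is only a sketch, with Plots.jl and plot(sin) standing in for whatever package is actually behind a given regression:

hyperfine --warmup 1 'julia +1.10 --startup-file=no -e "using Plots; plot(sin)"' 'julia +1.12 --startup-file=no -e "using Plots; plot(sin)"'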

8 Likes

Does anybody have any idea why the mean user (or user+system) time can be larger, even >10x larger, than the actual wall-clock time? I looked up the relevant hyperfine issues (links below), but the explanation offered there is multithreading, and I don’t know where that would be happening here. How many cores are in play, and do imports actually use more than one even when the Julia process defaults to single-threaded?
Document hyperfine’s output fields · Issue #443 · sharkdp/hyperfine
Larger system time than runtime reported · Issue #515 · sharkdp/hyperfine
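
A quick sanity check (a sketch) is to compare the machine’s core count with the Julia process’s own thread count; if both are unremarkable, the extra user time could instead come from parallel precompilation worker processes spawned during using, whose CPU time would be counted against the parent:

julia +1.11 --startup-file=no -e 'println((cpu_threads = Sys.CPU_THREADS, julia_threads = Threads.nthreads()))'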

Measuring that by subtracting the using Base process’s runtime from the using Downloads process’s runtime (or rather comparing the intervals) assumes that starting and exiting the process take the same time in both cases. That seems reasonable for startup because the state is identical, but exit doesn’t happen from the same state and can take arbitrarily long, e.g. when finalizers run: julia --startup-file=no -E "obj = Ref(3); finalizer(x -> Libc.systemsleep(x[]), obj)". No idea how to benchmark exit() in isolation, though.

Following up on some of the comments here regarding consistent code performance across Julia versions — I wanted to share some observations from our own experience with Rocket.jl, where we still support versions as far back as Julia 1.3.

We’re running exactly the same test suite and code across all supported Julia versions, with the same payload, and we’ve noticed similar patterns to those described by Fons. For example, we’ve seen test runtimes improve significantly between 1.3 and 1.9 — dropping from nearly 5 minutes to about 3.5 minutes. A great improvement! However, starting with 1.11, the performance regresses substantially — test time jumps back to 4.5 minutes, and the nightly build performs even worse than 1.3.

You can see an even more drastic version of this in the test suite for RxInfer.jl, where we support Julia 1.10 and above. There, the jump from 1.10 to 1.11 is dramatic.

Similar issues appear in our RxInferExamples.jl repo as well:

Seeing test runtimes go from ~17 minutes to ~23 minutes is quite concerning. The logs suggest that most of this additional time is spent in compilation too.

20 Likes

It’s pretty negligible. If you want to skip finalising, you can ccall(:exit, Cvoid, (Cint,), 0) for the sake of testing (POSIX only).
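
Combined with hyperfine that might look like this (a sketch; the raw exit skips Julia’s normal shutdown, so finalizers and atexit hooks no longer count toward the measured time):

hyperfine 'julia +1.11 --startup-file=no -e "using Downloads; ccall(:exit, Cvoid, (Cint,), 0)"'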

1 Like

@fonsp It would be interesting, if possible, to take that into account. If more deps of the packages you are installing require precompilation, the setup time would be directly greater, but hopefully you would get a corresponding boost later on.

My measurements use the latest versions of packages for all Julia versions, so PrecompileTools is running on Julia 1.10 and 1.11 as well.

In the dashboard, you can choose to only view precompile times:

or only the load time (i.e. @time import Example)

You can see that both of these are getting slower in Julia 1.12.
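
For a single package, the two numbers can be reproduced locally along these lines, each in a fresh process (a sketch; assumes an environment that already has Example installed):

julia +1.12 --startup-file=no --project=. -e 'using Pkg; @time Pkg.precompile()'   # precompile time for the active environment
julia +1.12 --startup-file=no --project=. -e '@time import Example'                # load time once the cache exists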

5 Likes

@fonsp You are more familiar with your code, but could you maybe add the option to include a representative workflow for each package? Maybe plot(sin) to start, and then we could add workflows for other packages via PRs?

If there were a base structure where we could just add to it, we could all help :slight_smile:
That would be amazing!

1 Like

It seems to me that there’s scope for a “TTFX tests” repository that aggregates TTFX code snippets for popular packages… I’ll happily spin one up if that’s of interest.

16 Likes

It’s ambiguous what this means. There are at least a couple of ways to read it:

  1. Across all Julia versions, only one version X of a given package Y is measured. If that version is not compatible with earlier Julia versions, it is omitted. That is what Symbolics.jl seems to do in that graph; its current version requires v1.10+, so its line only starts there.

  2. The version of a given package Y that is measured varies with the Julia version Z: it is the latest version compatible with Z. That is what the notebook implies:

:check_box_with_check: Include Julia versions before 1.10. Comparing with old Julia versions might not be valid, since different package versions might be loaded.

Ok, seems there’s interest. I’ll spin something up.

For just collecting TTFX test snippets, is there any metadata that would be worth having besides the package and code snippet?

1 Like

That would be amazing! What do you have in mind: a kind of general registry with one file (or more) per package, where the maintainers provide the snippets?

We’d probably want:

  • a Project.toml listing the dependencies needed by the snippet, including compat bounds
  • the source / origin of the snippet (documentation, readme, developer, etc.)
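
To make that concrete, one hypothetical layout for a single entry could be (package choice, versions, and snippet purely illustrative):

snippets/Example/Project.toml:

  [deps]
  Example = "7876af07-990d-54b4-ab0e-23690620f79a"  # UUID as recorded in the General registry

  [compat]
  Example = "0.5"
  julia = "1.10"

snippets/Example/snippet.jl:

  using Example
  hello("TTFX")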

Essentially, but I’m thinking we might as well make it open for anybody to add snippets. I’ve got an idea around this, and I’ll probably just see about implementing it then sharing it rather than describing it in full here.

I’ve been thinking about the Project.toml/Manifest.toml, however:

  1. I think each snippet should probably only involve a single package, to remain as focused as possible. Maybe a separate family of multi-package snippets if there’s demand/value.
  2. Given the different ways you might want to resolve package/dep versions, I feel like adding full manifest information is a bit much. I’m leaning towards just a minimum Julia + package version.

Yup.

That may not be possible for some of the most used packages, which require a combination of dependencies to do anything at all. Some examples I have in mind:

  • DifferentiationInterface.jl goes hand in hand with an autodiff backend like ForwardDiff.jl
  • JuMP.jl is useless without a solver like HiGHS.jl
  • Turing.jl usually calls Distributions.jl

In addition, many snippets will involve standard libraries like LinearAlgebra.jl or SparseArrays.jl, which might be versioned separately in the future.

That’s why I’m leaning towards at least a Project.toml. Even for your proposal of “just a minimum Julia + package version”, the standard way of encoding this is a Project.toml file.
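
For instance, the DifferentiationInterface + ForwardDiff pairing above could be pinned with nothing more than the [compat] section of such a Project.toml (version numbers illustrative):

  [compat]
  DifferentiationInterface = "0.6"
  ForwardDiff = "0.10"
  julia = "1.10"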

3 Likes