Startup time of 1000 packages – 53% slower in Julia 1.12 vs 1.10

Sure, I’ve just been doing things like:

~$ hyperfine 'julia +1.11 --startup-file=no -e "using Base"' 'julia +1.11 --startup-file=no -e "using Downloads"'
Benchmark 1: julia +1.11 --startup-file=no -e "using Base"
  Time (mean ± σ):     130.6 ms ±   5.8 ms    [User: 143.8 ms, System: 119.6 ms]
  Range (min … max):   124.6 ms … 148.0 ms    21 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     198.2 ms ±   5.0 ms    [User: 2270.8 ms, System: 142.3 ms]
  Range (min … max):   191.9 ms … 206.5 ms    14 runs
 
Summary
  julia +1.11 --startup-file=no -e "using Base" ran
    1.52 ± 0.08 times faster than julia +1.11 --startup-file=no -e "using Downloads"

Which I take to indicate that using Downloads takes ~70ms in practice, as opposed to the ~40ms indicated by @time or the ~200ms indicated by @time_imports.
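
For reference, here is roughly what the two in-process measurements mentioned above look like; each snippet is meant to run in its own fresh julia +1.11 --startup-file=no session, and @time_imports is assumed to come from InteractiveUtils, which is not loaded automatically in non-interactive runs:

# fresh process 1: total load time as reported from inside the process
@time using Downloads

# fresh process 2: per-dependency load-time breakdown
using InteractiveUtils
@time_imports using Downloads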

3 Likes

NetworkOptions was removed from the sysimage, so I think this is expected; see High package load time / allocations on Julia 1.11 · Issue #40 · JuliaLang/NetworkOptions.jl · GitHub.
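
A quick way to check that from a fresh session is to ask whether NetworkOptions is already loaded at startup; a rough sketch (Base.loaded_modules is internal, so treat this as a debugging aid only):

# run with `julia +1.10 --startup-file=no` and `julia +1.11 --startup-file=no`;
# if the linked issue is right, this prints true on 1.10 (NetworkOptions ships in the sysimage)
# and false on 1.11, where `using Downloads` has to load it separately
println(any(pkg -> pkg.name == "NetworkOptions", keys(Base.loaded_modules)))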

3 Likes
ufechner@framework:~$ hyperfine 'julia +1.10 --startup-file=no -e "using Downloads"' 'julia +1.11 --startup-file=no -e "using Downloads"'
Benchmark 1: julia +1.10 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     102.2 ms ±   1.5 ms    [User: 52.6 ms, System: 59.1 ms]
  Range (min … max):   100.1 ms … 105.0 ms    28 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using Downloads"
  Time (mean ± σ):     114.8 ms ±   2.5 ms    [User: 290.9 ms, System: 49.2 ms]
  Range (min … max):   111.1 ms … 121.6 ms    26 runs
 
Summary
  julia +1.10 --startup-file=no -e "using Downloads" ran
    1.12 ± 0.03 times faster than julia +1.11 --startup-file=no -e "using Downloads"

And I get only a 12% difference between Julia 1.11 and 1.10.

Because most of the time in that benchmark is from starting Julia itself.

1 Like

I don’t think that is the reason. Even if I load a very large package, where the startup time of Julia itself plays an insignificant role, I get similar results:

Benchmark 1: julia +1.10 --startup-file=no -e "using KiteModels"
  Time (mean ± σ):      5.530 s ±  0.015 s    [User: 5.168 s, System: 0.928 s]
  Range (min … max):    5.507 s …  5.549 s    10 runs
 
Benchmark 2: julia +1.11 --startup-file=no -e "using KiteModels"
  Time (mean ± σ):      6.256 s ±  0.015 s    [User: 6.307 s, System: 0.515 s]
  Range (min … max):    6.222 s …  6.273 s    10 runs
 
Summary
  julia +1.10 --startup-file=no -e "using KiteModels" ran
    1.13 ± 0.00 times faster than julia +1.11 --startup-file=no -e "using KiteModels"

In this example, Julia 1.10 is 13% faster than Julia 1.11. Note that this is only the package load time and does not include the precompilation time. Well, for me the load time is much more relevant than the precompilation time.
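
To keep precompilation out of a measurement like this, one option is to force it once beforehand; a minimal sketch, assuming KiteModels is already installed in the active environment:

# step 1, once and untimed: make sure cached pkgimages exist for the whole dependency tree
using Pkg
Pkg.precompile()

# step 2, in a fresh `julia --startup-file=no` process: this should now report load time only
@time using KiteModels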

1 Like

Another data point: time-to-first-gradient with Mooncake + DifferentiationInterface is 40% higher in Julia 1.11 than in 1.10:

> hyperfine --warmup 3 'julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"' 'julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"'
Benchmark 1: julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"
  Time (mean ± σ):     43.036 s ±  0.159 s    [User: 42.316 s, System: 0.651 s]
  Range (min … max):   42.892 s … 43.365 s    10 runs
 
Benchmark 2: julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"
  Time (mean ± σ):     60.154 s ±  0.400 s    [User: 59.122 s, System: 0.876 s]
  Range (min … max):   59.707 s … 61.109 s    10 runs
 
Summary
  julia +1.10 --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])" ran
    1.40 ± 0.01 times faster than julia --startup-file=no -e "using Mooncake, DifferentiationInterface; f(x) = sum(x .^2); gradient(f, AutoMooncake(config=nothing), [1., 2., -1.4])"

Here julia runs Julia 1.11.5. For some reason, julia +1.11 says "ERROR: 1.11 is not installed. Please run juliaup add 1.11 to install channel or version." I opened an issue about this.

As a side note, computing the first gradient of a basic quadratic function of three variables (3D vector) takes one minute??
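
To see where that minute goes, one can split it into load time and first-call compile time; a minimal sketch, run in a fresh julia --startup-file=no process:

@time using Mooncake, DifferentiationInterface                        # package load time
f(x) = sum(x .^ 2)
@time gradient(f, AutoMooncake(config=nothing), [1.0, 2.0, -1.4])     # first call: includes compilation
@time gradient(f, AutoMooncake(config=nothing), [1.0, 2.0, -1.4])     # second call: runtime only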

1 Like

You have 1.11.5 installed in the release channel, not via a separate 1.11 channel (arguably that could be improved; maybe open an issue in juliaup).

I find that much worse than load/compile time is the regression in TTFP (at least in this case), where 1.11 is ~2.5 times slower and 1.12 ~4 times slower than 1.10. And that example is for the exact same command, which has been precompiled with PrecompileTools.
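
For readers who have not used it, the PrecompileTools pattern referred to here looks roughly like this (a hypothetical package and workload, not the actual code from that example):

module MyPackage            # hypothetical package

using PrecompileTools

solve(x) = sum(abs2, x)     # stand-in for the real entry point

@setup_workload begin
    x = rand(100)                # setup that should not end up in the precompile cache
    @compile_workload begin
        solve(x)                 # the exact call users will make, compiled into the pkgimage
    end
end

end # module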

7 Likes

Does anybody have any idea why the user+system or user mean time can be larger, even >10x larger, than the actual wall-clock time? I looked up the relevant hyperfine issues (links below), but the explanation offered there is multithreading, and I don’t know where that would be happening here. How many cores are there, and do imports actually use multiple cores even when the Julia process defaults to single-threaded?
Document hyperfine’s output fields · Issue #443 · sharkdp/hyperfine
Larger system time than runtime reported · Issue #515 · sharkdp/hyperfine
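
One quick check relevant to the multithreading explanation: print how many CPU cores the machine has and how many threads the Julia process was actually started with (run in a fresh julia --startup-file=no session):

@show Sys.CPU_THREADS        # logical CPU cores available to the process
@show Threads.nthreads()     # Julia compute threads; defaults to 1 without -t / JULIA_NUM_THREADS
@show Threads.ngcthreads()   # GC threads, reported separately from compute threads (Julia ≥ 1.10)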

Measuring that by subtracting the using Base process’s runtime from the using Downloads process’s runtime (or rather the intervals) assumes that starting and exiting the process take the same time in both cases. That seems reasonable for startup, since the initial state is identical, but exit does not happen from the same state and can take arbitrarily long, e.g. while running finalizers: julia --startup-file=no -E "obj = Ref(3); finalizer(x -> Libc.systemsleep(x[]), obj)". No idea how to benchmark exit() in isolation, though.

Following up on some of the comments here regarding consistent code performance across Julia versions — I wanted to share some observations from our own experience with Rocket.jl, where we still support versions as far back as Julia 1.3.

We’re running exactly the same test suite and code across all supported Julia versions, with the same payload, and we’ve noticed similar patterns to those described by Fons. For example, we’ve seen test runtimes improve significantly between 1.3 and 1.9 — dropping from nearly 5 minutes to about 3.5 minutes. A great improvement! However, starting with 1.11, the performance regresses substantially — test time jumps back to 4.5 minutes, and the nightly build performs even worse than 1.3.

You can see an even more drastic version of this in the test suite for RxInfer.jl, where we support Julia 1.10 and above. There, the jump from 1.10 to 1.11 is dramatic.

Similar issues appear in our RxInferExamples.jl repo as well:

Seeing test runtimes go from ~17 minutes to ~23 minutes is quite concerning. The logs suggest that most of this additional time is spent in compilation too.

17 Likes

It’s pretty negligible. If you want to skip finalising, you can ccall(:exit, Cvoid, (Cint,), 0) for the sake of testing (POSIX only).
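
Combining that with the finalizer example above gives a small sketch of the difference (based on the suggestion in this post; the direct ccall should terminate the process before the 3-second finalizer gets a chance to run):

# a normal exit would eventually run the finalizer; the raw exit skips Julia’s shutdown path
obj = Ref(3)
finalizer(x -> Libc.systemsleep(x[]), obj)
ccall(:exit, Cvoid, (Cint,), 0)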

1 Like