Startup time of 1000 packages – 53% slower in Julia 1.12 vs 1.10

jeez, @jules , give me a minute :smiley:

I modified the script to make sure the csv file is saved as well. Finished doing that 3 minutes ago, came to discourse to post that the aggregate data is now also available on the repo, and see that this was already announced 2 minutes ago by you.

I wish my students were as magically efficient and quick as this community (they are great anyway though)

9 Likes

PRs to the master branch that make ttfx_snippets_vis.jl better (called in make_and_commit_plots.sh) would be appreciated.

Haha I didn’t know you had just set it up a minute ago, thanks for making that data available!

1 Like

Here is the detailed runner info:

Summary

  • Ubuntu 24.04 LTS
  • 128 GB RAM
  • AMD Ryzen 7 5700G

1 Like

I was thinking something like this:

Code
using CSV, DataFrames, Downloads, Statistics
using CairoMakie  # or GLMakie

# Download the aggregated TTFX data and read it into a DataFrame
pdd = CSV.read(Downloads.download("https://raw.githubusercontent.com/JuliaEcosystemBenchmarks/julia-ecosystem-benchmarks/refs/heads/jeb_logs/data/Julia-TTFX-Snippets/ttfx_snippets_data.csv"), DataFrame)

# Summary statistics of the precompile time per package and Julia version
pddg = combine(groupby(pdd, [:package_name, :julia_version]), :precompile_time .=> [minimum, mean, median, maximum])

# Pick one package and order the rows by Julia version
pgn = pddg[pddg.package_name .== "BaseDirs", :]
sort!(pgn, :julia_version)

# Bar plot of the mean precompile time, with min-max whiskers on top
barplot(pgn.precompile_time_mean, label = "Mean",
    axis = (; xticks = (1:nrow(pgn), pgn.julia_version), xticklabelrotation = π/4,
        xlabel = "Julia version", ylabel = "Precompile time (seconds)"))
rangebars!(1:nrow(pgn), pgn.precompile_time_minimum, pgn.precompile_time_maximum,
    color = :red, linewidth = 1, whiskerwidth = 10, label = "Min-max range")
axislegend()
current_figure()
7 Likes

That is great, here are a few comments:

  • The named versions alpha, lts, etc., probably need to be filtered out, or changed to include only the last handful of results, because lts includes all versions that were historically lts, not just the current lts (same for the other named versions)
  • Median might be better than mean because there are a ton of extreme outliers (a rough sketch of both points is below)
  • If someone gets around to submitting this as a PR it will be accepted quickly and will be present in the autogenerated plots (see ttfx_snippets_vis.jl)
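
To illustrate the first two bullets, here is a minimal sketch building on the snippet above; the exact labels used for the named versions are an assumption, so check them with unique(pddg.julia_version) first:

# Keep only concrete version numbers and plot the median instead of the mean
named = ["alpha", "beta", "rc", "lts", "stable", "nightly"]  # assumed label values
pgn = pddg[(pddg.package_name .== "BaseDirs") .&& .!(pddg.julia_version .∈ Ref(named)), :]
sort!(pgn, :julia_version)
barplot(pgn.precompile_time_median, label = "Median",
    axis = (; xticks = (1:nrow(pgn), pgn.julia_version), xticklabelrotation = π/4,
        xlabel = "Julia version", ylabel = "Median precompile time (seconds)"))
current_figure()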
3 Likes

Maybe it would make sense to store commit hashes or something for the changing specifiers? Otherwise it gets a bit complicated to map them back, I guess.

1 Like

That is available deep in the logs and can be added to the csv (at some point).

The discussion seems to be burying the lead a bit - from the plot it looks like 1.12 does not have a major regression compared to 1.10, correct?

3 Likes

If I look at my CI jobs:

They need 26 min with Julia 1.10 and 31.5 min with Julia 1.12, a difference of about 21%. But this is, of course, only one measurement, not necessarily representative.

1 Like

And when I do the same comparison with GMT, it takes ~12-13 minutes with 1.10 and 24-25 with 1.11, 1.12, and nightly.

And this despite significant improvements achieved through substantial reductions in invalidations on 1.12.

1 Like

Did you create an issue for this regression on GitHub?

I just switched over our benchmark CI to 1.12. With 1.12, import shows a 10% improvement, precompile is a slight regression, and TTFX (compiling an empty kernel) takes 55% longer.

1 Like

For DifferentiationInterface, the regression is clearly visible in my current CI runs (commit JuliaDiff/DifferentiationInterface.jl@7a87b5f, "chore: switch to Runic formatting from JuliaFormatter (#871)"):

  • SimpleFiniteDiff backend:
    • 1.10: 58m
    • 1.11: 1h19
    • 1.12: 1h27
  • ForwardDiff backend:
    • 1.10: 43m
    • 1.11: 59m
    • 1.12: 55m
3 Likes

Not particularly this one, but for example #59494, where, with the very helpful contributions of Kristoffer and Tim, the TTFP of the plot function dropped to the amazing:

julia> @time using GMT
  0.707362 seconds (1.11 M allocations: 64.533 MiB, 5.06% gc time, 3.01% compilation time)

julia> @time @eval plot(rand(5,2))
  0.002897 seconds (721 allocations: 34.281 KiB)

This is very likely the fastest TTFP of a plot() function in Julia. But unfortunately this does not propagate, and a call that is almost identical in the functions it calls (it differs by only one line)

julia> @time @eval bar(1:5, (20, 35, 30, 35, 27), width=0.5, color=:lightblue, limits=(0.5,5.5,0,40))
  0.625434 seconds (2.13 M allocations: 108.216 MiB, 6.82% gc time, 98.59% compilation time)

compiles again. Sure, I can add this one too to the precompile workload, but just that extra line adds 500 kB to the precompiled cache. The size of the precompiled cache is what stops me from adding much more to the PrecompileTools workload. I have code with a couple hundred LOC that adds ~10 MB to the precompile cache. I really don’t understand what happens in this land, only that the caches are huge.
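
For reference, a generic sketch of how such an extra call ends up in a PrecompileTools workload (hypothetical module and stand-in functions, not GMT's actual precompile file); every call compiled inside @compile_workload also grows the on-disk cache, which is exactly the trade-off described above:

module MyPlots  # hypothetical package standing in for GMT

using PrecompileTools

# Stand-ins for the real plotting entry points
plot(data; kwargs...) = size(data)
bar(x, y; kwargs...) = length(x)

@setup_workload begin
    data = rand(5, 2)
    @compile_workload begin
        plot(data)                                  # already covered by the workload
        bar(1:5, (20, 35, 30, 35, 27), width=0.5,   # the extra call that adds to the cache
            color=:lightblue, limits=(0.5, 5.5, 0, 40))
    end
end

end # module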

2 Likes

What you describe here is probably not directly related to the compile times of packages. I am still looking for a simple, small, reproducible test case for the compile time that shows an increase of more than 20%. Then we would have the chance to create an issue that might get addressed.

This is likely influenced by how you specialize on kwargs in your code.
In Makie we try to convert the kwargs to untyped dicts as early in the call chain as possible, so that the inner functions don’t specialize on the kwargs.
I’m not sure if that helps with GMT, but without it I’m sure you’ll get larger precompile times with slightly different kwargs!
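
A minimal sketch of that pattern with hypothetical functions (not Makie's or GMT's actual code): the exported entry point collects the kwargs into an untyped Dict right away, so the inner functions see a single concrete type and are not recompiled for every new kwarg combination.

function myplot(data; kwargs...)
    # Collect the keyword arguments into an untyped Dict at the outermost call
    attrs = Dict{Symbol,Any}(kwargs)
    return _myplot_inner(data, attrs)
end

function _myplot_inner(data, attrs::Dict{Symbol,Any})
    # Compiles once per data type, regardless of which kwargs were passed
    width = get(attrs, :width, 1.0)
    color = get(attrs, :color, :black)
    return (; width, color, n = length(data))
end

myplot(rand(5); width = 0.5, color = :lightblue)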

2 Likes

I think that what you mean is more or less what I do here

_common_plot_xyz() is the big function that does most of the parsing work. I still have to keep a Dict{Symbol, Any} because the input can take different forms. But one thing that puzzles me is that other functions that rely almost entirely on calls to this main function still compile sub-functions that should already have been precompiled (at least they do not show up as invalidations with SnoopCompile).
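
One way to see which sub-functions really are compiled at call time, independently of invalidations, is to run Julia with the --trace-compile flag; a rough sketch of such a session (the calls and the printed precompile statement are purely illustrative):

$ julia --trace-compile=stderr
julia> using GMT

julia> plot(rand(5,2))    # covered by the precompile cache: ideally nothing is printed

julia> bar(1:5, (20, 35, 30, 35, 27), width=0.5)
precompile(Tuple{...})

Every precompile(...) statement printed for the second call corresponds to a method that was not covered by the package's precompile cache and had to be compiled on the spot.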

Not just startup times are increasing. The runtimes are getting worse too. I have a couple of tests in GeoStatsFunctions.jl that are now failing in Julia v1.12:

CompositeVariogram: Test Failed at /home/runner/work/GeoStatsFunctions.jl/GeoStatsFunctions.jl/test/theoretical/composite.jl:102
  Expression: #= /home/runner/work/GeoStatsFunctions.jl/GeoStatsFunctions.jl/test/theoretical/composite.jl:102 =# @elapsed(sill(γ)) < 1.0e-5
   Evaluated: 0.007924943 < 1.0e-5

I didn’t change a single line of code :worried:

P.S.: the elapsed test is placed after a warmup call, so it is not measuring compilation.
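
For context, the pattern behind that test is roughly this (a sketch with a placeholder function, not the actual GeoStatsFunctions.jl code):

using Test

f(x) = sum(abs2, x)   # placeholder for the real sill(γ) call
x = rand(100)

f(x)                            # warmup call: compilation happens here
@test @elapsed(f(x)) < 1.0e-5   # the timed call therefore measures pure runtime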

2 Likes

Oh, that’s quite severe, someone should look into it.