Worse runtimes in Julia v1.12

The runtimes are getting worse in Julia v1.12.

I have a couple of tests in GeoStatsFunctions.jl that are now failing:

CompositeVariogram: Test Failed at /home/runner/work/GeoStatsFunctions.jl/GeoStatsFunctions.jl/test/theoretical/composite.jl:102
  Expression: #= /home/runner/work/GeoStatsFunctions.jl/GeoStatsFunctions.jl/test/theoretical/composite.jl:102 =# @elapsed(sill(γ)) < 1.0e-5
   Evaluated: 0.007924943 < 1.0e-5

The @elapsed test is placed after a warmup call, so it is not measuring compilation.
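For reference, the pattern is roughly the following (a minimal sketch; the composite γ below is a made-up example, not the exact object constructed in composite.jl):

using GeoStatsFunctions, Test

# hypothetical composite variogram, just to illustrate the test pattern
γ = GaussianVariogram() + SphericalVariogram()

sill(γ)                            # warmup call, so compilation is not measured
@test @elapsed(sill(γ)) < 1.0e-5   # the timing assertion that now fails on v1.12 CI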

I don’t know what else I can do to preserve the performance of packages I wrote a long time ago. I believe all of them contain idiomatic Julia code, follow best practices, etc.

2 Likes

I don’t think it’s true that the runtimes are getting worse in 1.12. The compiler is generally getting better over time, with better inference and more optimisations.
Performance regressions hit every language, although among high-performance languages Julia is uniquely vulnerable, because it, by design, seamlessly mixes slow dynamic code with fast statically inferred code. That makes it easy for the latter to slip into the former category undetected.

However, every release has some performance regressions which may be due to either small compiler details, or some changes in Base. It’s almost certainly one of these.

In this case, it would be helpful if you could drill down to what, precisely, is slowing down. Create a smaller, copy-pasteable example. Is something suddenly allocating? No longer SIMD’ing? These things are usually solvable.
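To be concrete about what drilling down could look like, something along these lines on both Julia versions would already narrow it down a lot (a sketch, assuming γ is the composite variogram from the failing test):

using BenchmarkTools

@btime sill($γ)          # interpolate γ so the global lookup is not part of the measurement
@allocated sill(γ)       # nonzero allocations where there used to be none is a red flag

@code_warntype sill(γ)   # look for Any annotations, i.e. lost inference
# @code_llvm debuginfo=:none sill(γ)   # check whether vectorized instructions are still emitted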

If your question is broader: “How can I prevent performance regressions?”, then the answer is that there is no way, in any language, to prevent them entirely. But also, Julia is expected to have a higher baseline of performance regressions.

3 Likes

Isn’t the MWE linked above enough to reproduce?

The GitHub Actions also show this consistent slow-down across all platforms:

It’s enough to reproduce, but there is some work needed to drill down to what is causing the regression among all the code in your package and all its dependencies.

What would be needed in order to pin down why the regression happened is some small example, ideally dependency-free, that can be copy-pasted directly into a REPL.
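For example, something in this spirit, which mimics the composite sill computation without pulling in the package (all the names here are made up for illustration):

# stand-in for CompositeFunction: coefficients plus component functions
struct Composite{CS,FS}
  cs::CS
  fs::FS
end

struct Gaussian
  sill::Float64
end

mysill(f::Composite) = sum(f.cs .* map(g -> g.sill, f.fs))

γ = Composite((1.0, 2.0), (Gaussian(1.0), Gaussian(0.5)))

using BenchmarkTools
@btime mysill($γ)

If a toy version like that reproduces the slowdown between 1.11 and 1.12, it becomes an actionable compiler issue; if it doesn’t, the problem is somewhere in the package or its dependencies.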

1 Like

Is there a real regression? I see this:

julia> [@elapsed sill(γ) for _ ∈ 1:10]
10-element Vector{Float64}:
 5.2e-6
 2.0e-7
 1.0e-7
 0.0
 0.0
 0.0
 0.0
 1.0e-7
 0.0
 1.0e-7

julia> versioninfo()
Julia Version 1.12.0

and

julia> [@elapsed sill(γ) for _ ∈ 1:10]
10-element Vector{Float64}:
 2.0e-6
 1.0e-7
 0.0
 0.0
 1.0e-7
 0.0
 1.0e-7
 0.0
 1.0e-7
 0.0

julia> versioninfo()
Julia Version 1.11.7

so maybe the test is just a bit flaky and too dependent on what runner it’s on?

@btime gives an allocation-free sub-2 ns result on both 1.11 and 1.12, so it looks like the function, which if I see correctly is just

sill(f::CompositeFunction) = sum(f.cs .* map(sill, f.fs))

gets constant folded.
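One caveat: with the argument interpolated directly, @btime can constant-fold a trivial call like this, so a sub-2 ns reading does not necessarily measure real work. The usual trick to rule that out (again assuming γ is the composite from the test) is to hide the value behind a Ref:

using BenchmarkTools

@btime sill($γ)             # may be constant folded to a literal result
@btime sill($(Ref(γ))[])    # defeats constant folding and times the actual call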

4 Likes

Profile on available platforms and identify code or dependencies to patch, I suppose. There are no performance guarantees across versions of source code; I don’t think that’s even feasible until you’re talking about a particular environment’s binaries on tested platforms. At the language level, you’re not only vulnerable to changes in the language implementation, as this implies; you’re at the mercy of all of your dependencies. LoopVectorization eventually got some funding to be maintained for at least a while longer, but it was nearly deprecated for Julia v1.11, to the detriment of all the high-performance packages that relied on its loop auto-vectorization. Nobody sets out to cause performance regressions, but we do need people and resources to spot and fix anything.

2 Likes

We had some conversations on Zulip where some people started to suspect @elapsed is buggy, and could be leaking some extra time into the measurement that is not captured by BenchmarkTools.jl.

Given that this regression is not reproduced locally on @nilshg’s and others’ machines, I wonder if it has to do with the GitHub setup or with @elapsed itself.

1 Like

More evidence in another package (TableTransforms.jl):

Actions are taking 2-3x more time to finish in Julia v1.12.

2 Likes

In my experience v1.12 CI is faster than v1.11 CI, and almost as fast as v1.10 CI. Obviously performance will depend on the workload. Not sure if it’s useful to bring up such “evidence” if you’re not going to dig into what is causing it.

The CI wall-time is surely a mix of runtime and startup time (see Startup time of 1000 packages – 53% slower in Julia 1.12 vs 1.10 - #81 by ufechner7, where you’re already participating).

The most valuable thing here is concrete, isolated cases. Let’s focus on the concrete examples where you can isolate a runtime regression.

2 Likes

@juliohm TableTransforms.jl isn’t using julia-actions/cache (a shortcut action to cache Julia artifacts, packages, and registries).

I recommend trying that out.

4 Likes

We missed the cache on this repo. Thanks for catching that, @ianshmean :+1:

1 Like

Numbers don’t improve much with the cache action:

If you just look at the list of precompile timings, you can explain most of the difference. The test run time is just 40s longer on 1.12 ubuntu vs. lts ubuntu. Sure that’s slower and worth looking into, but not 2x.

I have mentioned this before (though did not show a screenshot with the times). GMT CI times double for Julia 1.12 compared to 1.10 (12 min vs 24 min), but the big jump actually occurred at Julia 1.11.

We are now talking about something completely different from what this thread started with. Is the original note about the runtime of 1.12 resolved, and has the thread pivoted to end-to-end CI time?

1 Like

The @elapsed tests posted at the beginning of this thread are still failing in Julia v1.12 after warmup, so I don’t think the thread is resolved.

Other issues related to CI are also relevant.

Your assumption that @elapsed is an accurate way to measure the runtime of a function is wrong.

@btime is already better. Nevertheless, the performance of GitHub runners can vary widely.

Another question: are you sure that code coverage is disabled when you run your performance tests? Code coverage can increase execution time a lot, and in an unpredictable way.
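If coverage turns out to be the culprit, one option is to skip the timing assertions when it is enabled; the flag is exposed via Base.JLOptions() (a sketch, keeping the original @elapsed check):

coverage_enabled = Base.JLOptions().code_coverage != 0

if !coverage_enabled
  @test @elapsed(sill(γ)) < 1.0e-5
end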

8 Likes

Okay, but you are running @elapsed on a shared compute resource (CI). The way to report a performance regression is to basically do:

  • post the code (ideally can be copy pasted)
  • run it with BenchmarkTools
  • show the results on one julia version compared to another
  • show that the newer julia version is worse

An @elapsed call in some CI log will not gather any interest.
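Concretely, something along these lines, run under each Julia version with the outputs posted side by side (γ here is assumed to be the composite variogram from the failing test; b_111 and b_112 are just names for the saved results from each version):

using BenchmarkTools

b = @benchmark sill($γ)
show(stdout, MIME"text/plain"(), b)

# with results saved from both versions, BenchmarkTools can compare them directly:
# judge(median(b_112), median(b_111))   # reports regression / improvement / invariant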

4 Likes

I could certainly replace @elapsed with @btime, and will probably do that when I find some time; however, that would just hide a performance regression, which, as you said, can be the result of various things (CI, codecov, etc.).

Suppose Julia v1.12 runtimes have improved compared to previous versions of the language, which is a reasonable hypothesis. Still, the net effect we are seeing across various packages is undeniable: tests take twice the time to finish with Julia v1.12.

Anyone can copy/paste the MWE from the test suite and run it on their own machines with Julia v1.10, v1.11 and v1.12. If the @btime results match, then we can attribute the issue to the @elapsed implementation across Julia versions or to the interaction of Julia with GitHub CI and codecov tools.
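If the check has to stay in the test suite, @belapsed from BenchmarkTools is a drop-in option: it takes the minimum time over many samples and supports argument interpolation, which makes it far less noisy than a single @elapsed (a sketch, keeping the original threshold):

using BenchmarkTools, Test

@test @belapsed(sill($γ)) < 1.0e-5   # minimum over many samples, compilation excluded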

4 Likes