Julia 1.10-rc1 gives a huge speed-up over Julia 1.9

I am running 400 simulations based on ModelingToolkit, each covering 11 h of real time.
With Julia 1.9, launching Julia with:

julia --project -i -q -t 16 

it takes 44 s (second run).
With Julia 1.10-rc1, launching Julia with:

julia --project -i -q -t 16 --gcthreads=8,1

it takes 25.3 s (second run).

This is a speed-up by a factor of 1.7. Very nice!

The reason is two-fold:

  • single-threaded execution sped up by a factor of 1.1
  • multithreaded execution sped up much more, mainly because the GC time
    dropped from 60% to 40%.
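
For anyone who wants to check the GC share on their own workload, here is a minimal sketch using `@timed` from Base. The function `workload` is a hypothetical, allocation-heavy stand-in and not the ModelingToolkit simulation from this post; the numbers it prints will differ from the ones quoted here.

```julia
# Hypothetical allocation-heavy workload, standing in for one simulation run.
function workload(n)
    total = 0.0
    for i in 1:n
        v = rand(1000)      # allocates a fresh vector on every iteration
        total += sum(v)
    end
    return total
end

workload(10)                # warm-up call, so compilation is not timed

# @timed returns a NamedTuple with the fields value, time, bytes, gctime, ...
stats = @timed workload(100_000)
gc_share = stats.gctime / stats.time   # both are in seconds

println("elapsed: ", round(stats.time; digits=2), " s, GC share: ",
        round(100 * gc_share; digits=1), " %")
```

Running this once under each Julia version, with the same `-t` setting, gives a rough like-for-like comparison of the GC share.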

@ChrisRackauckas Has multithreading of ModelingToolkit simulations changed with Julia 1.10? I am using the same model, but run it again and again with different parameters. Under which conditions can this be used with multithreading?

To answer my own question: Multi-threading only works with MKL. But that could be a separate topic.

This is what my core loop looks like:

    Δαs     = se.Δα_max:-0.5:se.Δα_min
    Δα_ests = se.Δα_max:-0.5:se.Δα_min
    # we need a sorted dictionary for the results because the execution order
    # of the threads is undefined
    dict = SortedDict{Float64, Vector{Float64}}()
    lk = ReentrantLock()
    Threads.@threads for Δα0 in Δαs
        println("Δα = $(Δα0)")
        Eps = ep2alpha(se, Δα_ests; Δα0=Δα0)
        # writing to a dictionary is not thread safe, we need a lock here
        lock(lk) do
            dict[Δα0] = Eps
        end
    end
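
As a possible alternative to the lock and the sorted dictionary, each iteration can write into its own slot of a preallocated vector. This is only a sketch of that different technique, not the code used for the timings in this thread: `simulate` is a hypothetical stand-in for `ep2alpha`, and the range of `Δαs` is assumed for illustration.

```julia
using Base.Threads

# Hypothetical stand-in for the expensive simulation call (ep2alpha in the post).
simulate(Δα0) = [Δα0, Δα0^2]

Δαs = 2.0:-0.5:-2.0     # assumed range, for illustration only

# One result slot per parameter value, allocated before the loop starts.
results = Vector{Vector{Float64}}(undef, length(Δαs))

# Each iteration writes only to its own index, so no lock is needed,
# and the results stay in the same order as Δαs.
@threads for i in eachindex(Δαs)
    results[i] = simulate(Δαs[i])
end

println(length(results), " results collected")
```

The trade-off is that results are addressed by position rather than by the value of Δα0, so the range has to be kept alongside the results.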

The GC has improved with multithreading, if I’m not mistaken. And startup time is dramatically improved.


Can you try 1.10 but before rc1?

Curious to see what’s the impact of this PR

With Julia 1.10.0-beta3 I get the same results:

 25.739921 seconds (1.60 G allocations: 146.002 GiB, 38.21% gc time, 0.02% compilation time: 100% of which was recompilation)

Summary of performance improvements of Julia 1.10.0-rc1 compared to 1.9.3

Speed-up, single-threaded:
Factor 1.1

Speed-up, multithreaded (16 threads):
Factor 1.7

Speed-up, startup time:
Factor 1.6

(My startup time includes loading all packages and also some configuration data.) To get the full multithreaded speed advantage, you must launch Julia with parameters similar to these: --gcthreads=8,1.
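
To confirm that the launch parameters took effect, the thread counts can be printed from inside the session. This is a small sketch: `Threads.ngcthreads()` is, as far as I know, only available from Julia 1.10 on, so the call is guarded to let the same snippet run under 1.9 as well.

```julia
println("Julia version:   ", VERSION)
println("compute threads: ", Threads.nthreads())

# Threads.ngcthreads is assumed to exist only on Julia 1.10 and later,
# hence the isdefined guard for older versions.
if isdefined(Threads, :ngcthreads)
    println("GC threads:      ", Threads.ngcthreads())
else
    println("GC threads:      not reported by this Julia version")
end
```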

My packages:

⌅ [aaaaaaaa] ControlSystemsBase v1.9.5
  [a93c6f00] DataFrames v1.6.1
  [82cc6244] DataInterpolations v4.5.0
  [864edb3b] DataStructures v0.18.15
  [4e289a0a] EnumX v1.0.4
  [7a1cc6ca] FFTW v1.7.1
  [cf66c380] FastChebInterp v1.2.0
  [a98d9a8b] Interpolations v0.14.7
  [033835bb] JLD2 v0.4.38
  [b964fa9f] LaTeXStrings v1.3.1
  [23992714] MAT v0.10.6
  [961ee093] ModelingToolkit v8.72.2
⌅ [1dea7af3] OrdinaryDiffEq v6.58.1
  [438e738f] PyCall v1.96.2
  [d330b81b] PyPlot v2.11.2
⌅ [295af30f] Revise v3.5.7
⌅ [f2b01f46] Roots v2.0.20
  [9672c7b4] SteadyStateDiffEq v1.16.1
  [21f18d07] Timers v0.1.5
  [ddb6d928] YAML v0.4.9