I was experimenting with Threads.@threads because I am trying to parallelize a for loop: I want to run taylorinteg from TaylorIntegration.jl several times, independently, for different initial conditions. To do this, I wrote an MWE reproducing the Kepler example in Jupyter Notebook Viewer

Here is the code:

using TaylorIntegration

const μ = 1.0
const q0 = [0.19999999999999996, 0.0, 0.0, 3.0] # an initial condition for elliptical motion
const order = 28
const t0 = 0.0
const t_max = 10*(2π) # we are just taking a wild guess about the period ;)
const abs_tol = 1.0E-20
const steps = 500000

const r_p3d2 = TaylorSeries.Taylor1{Float64}

# the equations of motion for the Kepler problem:
function kepler!(dq, q, params, t)
    r_p3d2 = (q[1]^2+q[2]^2)^(3/2)
    dq[1] = q[3]
    dq[2] = q[4]
    dq[3] = -μ*q[1]/r_p3d2
    dq[4] = -μ*q[2]/r_p3d2
    return nothing
end

# one independent integration; returns the final time reached
function task()
    t, _ = taylorinteg(kepler!, q0, t0, t_max, order, abs_tol, maxsteps=steps)
    return t[end]
end

# threaded version: x independent integrations spread over the available threads
function f_par(x)
    xn = zeros(x)
    Threads.@threads for i in 1:x
        xn[i] = task()
    end
    return nothing
end

# serial version, for comparison
function f(x)
    xn = zeros(x)
    for i in 1:x
        xn[i] = task()
    end
    return nothing
end

However, when I time it with @time, the parallelized version (I use 72 CPU cores on the server) isn’t much faster, mainly because of GC time:

julia> @time f(1000)
 27.500834 seconds (198.44 M allocations: 55.672 GiB, 3.21% gc time)

julia> @time f_par(1000)
 26.797330 seconds (198.44 M allocations: 55.672 GiB, 76.07% gc time)
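
Before reading too much into timings like these, it can help to confirm that the session really has several threads and that the loop is spread across them. The following is a minimal sketch using only Base; it assumes Julia was started with something like `julia -t 72` or with JULIA_NUM_THREADS set, and the array `ids` is just an illustrative name.

```julia
using Base.Threads

println("threads available: ", nthreads())

# Record which thread handled each iteration to see how the loop was distributed.
ids = zeros(Int, 16)
@threads for i in 1:16
    ids[i] = threadid()
end
println("distinct threads used: ", length(unique(ids)))
```

If this reports a single thread, the threaded loop runs serially and the two timings would be expected to match.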

One more problem: the GC time varies greatly, seemingly at random (from 3% to 67% for the non-parallelized version, for instance). Can you tell whether this is a consequence of server activity, my code, or something internal to taylorinteg?

(This is on Julia 1.6.5)

The TaylorIntegration library seems to do a lot of allocations:

julia> @time task()
  0.018372 seconds (198.46 k allocations: 52.447 MiB, 14.05% gc time)

With more threads running concurrently, the allocation rate (allocations per unit time) increases to the point where the GC basically cannot keep up. I think the library needs to be optimized a bit to reduce the number of allocations.
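
To illustrate what reducing allocations means in general, here is a minimal sketch using only Base. It does not touch TaylorIntegration's internals: `step_alloc` and `step_inplace!` are hypothetical stand-ins for one unit of work, one returning a fresh array on every call and the other writing into a buffer supplied by the caller.

```julia
# Hypothetical stand-in for work that allocates a new array on every call.
step_alloc(q) = q .* 0.5 .+ 1.0

# Same arithmetic, but written into a caller-supplied buffer (no new array).
function step_inplace!(out, q)
    @. out = q * 0.5 + 1.0
    return out
end

# Repeats the allocating step n times; each iteration creates garbage for the GC.
function run_alloc(q, n)
    for _ in 1:n
        q = step_alloc(q)
    end
    return q
end

# Repeats the in-place step n times, swapping two preallocated buffers.
function run_inplace(q, n)
    a = copy(q)
    b = similar(q)
    for _ in 1:n
        step_inplace!(b, a)
        a, b = b, a
    end
    return a
end

q = [0.2, 0.0, 0.0, 3.0]
run_alloc(q, 1); run_inplace(q, 1)   # compile before measuring

bytes_alloc = @allocated run_alloc(q, 10_000)
bytes_inplace = @allocated run_inplace(q, 10_000)
println("allocating: ", bytes_alloc, " bytes; in-place: ", bytes_inplace, " bytes")
```

The in-place variant only allocates its two working buffers, so it leaves far less for the GC to do when many threads run it at once. Whether taylorinteg can be restructured this way is a question for the library itself.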

Do you have any idea what causes the instability of the GC time percentage?

julia> @time f(10)
  0.431646 seconds (1.98 M allocations: 570.078 MiB)

julia> @time f(10)
  0.903906 seconds (1.98 M allocations: 570.078 MiB, 52.85% gc time)

julia> @time f(10)
  0.480541 seconds (1.98 M allocations: 570.078 MiB, 11.01% gc time)

The GC uses various heuristics that depend on the age of objects, the total memory allocated, and so on, so it isn’t too surprising that GC time varies between runs.
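
One way to look at this variation more systematically is to time several trials and read the GC time from `@timed`, calling `GC.gc()` before each trial so that garbage left over from earlier runs is less likely to be charged to the next one. This is only a sketch: `workload` is a hypothetical allocation-heavy function standing in for f(10), and forcing a collection reduces but does not remove run-to-run variation.

```julia
# Hypothetical allocation-heavy workload standing in for f(10).
function workload(n)
    total = 0.0
    for i in 1:n
        v = rand(1_000)      # fresh array each iteration
        total += sum(v)
    end
    return total
end

workload(10)                 # compile before timing

for trial in 1:5
    GC.gc()                  # start each trial from a freshly collected heap
    stats = @timed workload(20_000)
    pct = 100 * stats.gctime / stats.time
    println("trial ", trial, ": ", round(stats.time, digits=3), " s, ",
            round(pct, digits=1), "% gc time")
end
```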

