Integrator AutoVern9(Rodas5) take a lot of time DifferentialEquations.jl

I have systems:

function HR!(du, u, p, t)
    
    function sigma(x)
            return 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
    end
    
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du[1] = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du[2] = c - d * x1 ^2 - y1
    du[3] = r * ( s * ( x1 - xr ) - z1 )
    
    du[4] = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du[5] = c - d * x2 ^2 - y2
    du[6] = r * ( s * ( x2 - xr ) - z2 )
    
    return SVector(du[1], du[2], du[3],
                    du[4], du[5], du[6])
end

and parametres:

u0 = [ 0., 0.2, 0.3,
        0., 0.5, 0.6]
tspan = (0., 100000.)

a = 1.
b = 3.
c = 1.
d = 5.
xr = -1.6
r = 0.01 # 0.01
s = 5.
I = 4.
xv = 2.

k1= -0.17
k2 = k1
k = 0.0

Code for solve system:

p = [a, b, c, d,
        s, xr, r, I, xv, k1, k2, k]
prob = ODEProblem(HR!, [-1.5, 0.0, 0.0, -2.5, 0.0, 0.0], tspan, p)
 sol = solve(prob, AutoVern9(Rodas5()), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)

I need use time 500.000 or 1.000.000 but it is very long. I waited about 10 min and then abort code. How can i optimize time of calculating?

I change parametre k in cycle from 0.0 before 0.2 with step 0.001 and every iteration integrator compilation

Cycle:

k_space = range(0., 0.2, step = 0.001)
for (i, k) in enumerate(k_space)
    if i == 1
        global initialcondition =  [-1.5, 0.0, 0.0, -2.5, 0.0, 0.0]

    end
    
    p = [a, b, c, d,
        s, xr, r, I, xv, k1, k2, k]
    prob = ODEProblem(HR!, initialcondition, tspan, p)
    sol = solve(prob, AutoVern9(Rodas5()), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
    initialcondition = sol[length(sol.u)]
1 Like

I tried using @inbounds and it didn’t help me much.
System with inbounds:

function HR!(du, u, p, t)
    
    function sigma(x)
            return 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
    end
    @inbounds begin
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du[1] = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du[2] = c - d * x1 ^2 - y1
    du[3] = r * ( s * ( x1 - xr ) - z1 )
    
    du[4] = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du[5] = c - d * x2 ^2 - y2
    du[6] = r * ( s * ( x2 - xr ) - z2 )
    end
    return SVector(du[1], du[2], du[3],
                    du[4], du[5], du[6])
end

How i understand compilate integator take a lot of time but how optimize it

I test integrator with 300.000 time and it work, similarly with 350.000 but if i take 400.00 or above integrator freezes. When i take 300.000 and 350.000 i don’t use loop for parametr

Try this:

function sigma(x)
         return 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
end

function HR(u, p, t)
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du1 = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du2 = c - d * x1 ^2 - y1
    du3 = r * ( s * ( x1 - xr ) - z1 )
    
    du4 = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du5 = c - d * x2 ^2 - y2
    du6 = r * ( s * ( x2 - xr ) - z2 )
    return SVector(du1, du2, du3,
                    du4, du5, du6)
end

I typed it on my phone so there might be typos in there (I can’t test it). I think what you got wrong is that you used the inplace syntax f!(du, u, p, t) and you also returned a vector (unnecessary). I changed it to the allocating version f(u,p,t), returning the result du.

1 Like

Thank you, i try this

It speed up time
result benchmark:

edited system:

function HR!(du, u, p, t)
    function sigma(x)
            return 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
    end
    @inbounds begin
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du1 = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du2 = c - d * x1 ^2 - y1
    du3 = r * ( s * ( x1 - xr ) - z1 )
    
    du4 = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du5 = c - d * x2 ^2 - y2
    du6 = r * ( s * ( x2 - xr ) - z2 )
    end
    return SVector{6}(du1, du2, du3,
                    du4, du5, du6)
end

also it fixed using many RAM

The incorrect use of SVectors does exist here, but since the return is unused it just gets deleted by the compiler so that’s fine. Here’s a break down:

using OrdinaryDiffEq

function sigma(x)
    return 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
end

function HR!(du, u, p, t)
        
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du[1] = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du[2] = c - d * x1 ^2 - y1
    du[3] = r * ( s * ( x1 - xr ) - z1 )
    
    du[4] = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du[5] = c - d * x2 ^2 - y2
    du[6] = r * ( s * ( x2 - xr ) - z2 )
end

u0 = [ 0., 0.2, 0.3,
        0., 0.5, 0.6]
tspan = (0., 100000.)

a = 1.
b = 3.
c = 1.
d = 5.
xr = -1.6
r = 0.01 # 0.01
s = 5.
I = 4.
xv = 2.

k1= -0.17
k2 = k1
k = 0.0

p = [a, b, c, d,
        s, xr, r, I, xv, k1, k2, k]
prob = ODEProblem(HR!, [-1.5, 0.0, 0.0, -2.5, 0.0, 0.0], tspan, p)

using BenchmarkTools
@btime sol = solve(prob, AutoVern9(Rodas5()), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 6.050 s (19333763 allocations: 1.86 GiB)

julia> sum(sol.alg_choice .== 2)
0

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 3.773 s (16571721 allocations: 1.81 GiB)

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, dense = false, maxiters = 10000000)
# 2.032 s (1381039 allocations: 185.68 MiB)

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 1.973 s (46 allocations: 14.42 KiB)

@profview sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)

function sigma(x)
    return @fastmath 1.0 / ( 1.0 + exp( -10.0 * ( x  - ( - 0.25 ) ) ) )
end

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 1.797 s (46 allocations: 14.42 KiB)

What this is showing is that first of all, the biggest source of overhead is the existence of the possibility of switching, even though it’s not used. Since this equation is just a non-stiff equation (as shown by alg_choice being 1 over the whole integration), cutting out the overhead of holding onto the stiff solver cuts time down by about 2 seconds. The next major contribution is the saving overhead, which you can see if you run a profile, or another way to see it is just by turning off saving. That’s the next 2 seconds. Then when this is all optimized, you can see there’s an opportunity to @fastmath the exp operation which ends up being the biggest cost in the calculation by far, though this 0.2 is only meaningful if the other pieces are cut down.

The final profile is then:

Which is then the sign of optimized, because what it means is that the steps (perform_step) take up the vast majority of the time (so no loading or saving overhead), with the majority of the stepping time being due to a @fastmath exp call. That then identifies that specific operation as your limiting factor.

So in the end, most of the cost seems to be that it’s just a very big time span and taking a lot of steps because the tolerance is very low. With that in mind, then the biggest improvement would come from just finding an explicit ODE solver that takes less steps. I’ll continue the benchmarks in the no-saving mode because the improvements for this would apply even if the saving is done, though you’d need to about double the time if saving is done every step.

@btime sol = solve(prob, Vern8(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 866.393 ms (45 allocations: 11.89 KiB)

@btime sol = solve(prob, Vern7(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 1.608 s (43 allocations: 9.39 KiB)

@btime sol = solve(prob, Tsit5(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 2.369 s (41 allocations: 5.67 KiB)

Of the simple Runge-Kutta methods, it seems Vern8 is the most optimized method on this equation (at least, without checking the precision), solving in about half the function calls. Exactly why is uncertain here, we’d have dig into a lot more detail, but in general whether to use Vern7 vs 8 vs 9 is somewhat a toss up: this one is an 8.

Lastly, what @IlianPihlajamaa was referring to is that you can use a static array form instead of mutation. If you look at the computation, this actually slows it down:

function HR(u, p, t)
        
    a, b, c, d, s, xr, r,  I, vs, k1, k2, el_link  = p
    x1, y1, z1, x2, y2, z2 = u
    
    du1 = y1 + b * x1 ^ 2 - a * x1 ^3 - z1 + I - k1 * ( x1 - vs ) * sigma(x2) + el_link * ( x2 - x1 )
    du2 = c - d * x1 ^2 - y1
    du3 = r * ( s * ( x1 - xr ) - z1 )
    
    du4 = y2 + b * x2 ^ 2 - a * x2 ^3 - z2 + I - k2 * ( x2 - vs ) * sigma(x1) + el_link * ( x1 - x2 )
    du5 = c - d * x2 ^2 - y2
    du6 = r * ( s * ( x2 - xr ) - z2 )
    return SVector(du1, du2, du3,
                    du4, du5, du6)
end

p = SA[a, b, c, d,
        s, xr, r, I, xv, k1, k2, k]
probsa = ODEProblem(HR, SA[-1.5, 0.0, 0.0, -2.5, 0.0, 0.0], tspan, p)

@btime sol = solve(probsa, Vern8(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 891.670 ms (22 allocations: 10.42 KiB)

But if you are then adding saving, then the probsa static array form has better packing in the final result, and thus is beneficial:

@btime sol = solve(probsa, Vern8(), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 1.566 s (1302581 allocations: 1.04 GiB)

@btime sol = solve(probsa, Vern8(), abstol = 1e-11, reltol = 1e-11, dense=false, maxiters = 10000000)
# 993.665 ms (46 allocations: 129.09 MiB)

@btime sol = solve(prob, Vern8(), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 6.709 s (19532495 allocations: 2.15 GiB)

That gives us a best final time with saving as 1.5s, 870ms if not saving, and ~1s if you don’t need a dense interpolation.

I suggest then choosing saveat appropriately to avoid the interpolation. Then the SVector + saveat + Vern8 approach gives you a significant boost.

2 Likes

Thank you
What are you using for output how on screenshot?

I repeat your time measurement and has this result for time 300.000

@btime sol = solve(prob, AutoVern9(Rodas5()), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 18.824 s (58154021 allocations: 5.53 GiB)

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, maxiters = 10000000)
# 16.055 s (49846224 allocations: 5.37 GiB)

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, dense = false, maxiters = 10000000)
# 5.006 s (4153916 allocations: 518.40 MiB)

@btime sol = solve(prob, Vern9(), abstol = 1e-11, reltol = 1e-11, save_everystep=false, maxiters = 10000000)
# 4.121 s (46 allocations: 14.42 KiB)

When i using save_everystep = false i have only two points in sol.u

After using static array i have this result:


I’m just using @profview in VS Code. It’s similar to:

Yes, that’s my point. Most of the time is due to saving output. If you don’t need a full continuous output, dense=false, save_everystep=false, or saveat will improve timings a lot.

Thank you