I've found that interpolation in the callback interface of DifferentialEquations.jl can slow down performance and blow up memory allocations.
To demonstrate, I'll use the cosine/sine example from the callbacks documentation, with the state vector `u` made five times as long (10 elements instead of 2) to exaggerate the slowdown.
using DifferentialEquations, StaticArrays, BenchmarkTools
# Five uncoupled harmonic oscillators (the 2-element doc example, repeated 5x)
function fun(u, p, t)
    @SVector [u[2], -u[1], u[4], -u[3], u[6], -u[5], u[8], -u[7], u[10], -u[9]]
end
tspan = (0.0,1000.0)
u0 = @SVector [1., 0., 1., 0., 1., 0., 1., 0., 1., 0.]
prob = ODEProblem(fun,u0,tspan)
Benchmarking shows that solving this ODE without any callbacks takes about 1 ms:
@benchmark sol = solve(prob, DP5(), save_everystep=false)
> BenchmarkTools.Trial:
> memory estimate: 128.73 KiB
> allocs estimate: 2290
> --------------
> minimum time: 488.000 μs (0.00% GC)
> median time: 1.054 ms (0.00% GC)
> mean time: 1.096 ms (3.09% GC)
> maximum time: 8.769 ms (87.27% GC)
> --------------
> samples: 4524
> evals/sample: 1
Next, I solve the ODE with a simple callback. I use a dummy condition that always returns -1.0; since the condition never crosses zero, the callback never triggers, and little to no extra work should be done. However, the memory allocations blow up ~1000x and the runtime is ~80x slower!
condition(u, t, integrator) = -1.0  # never crosses zero, so affect! never fires
affect!(integrator) = terminate!(integrator)
cb = ContinuousCallback(condition, affect!, save_positions=(false, false))
@benchmark sol = solve(prob, DP5(), save_everystep=false, callback=cb)
> BenchmarkTools.Trial:
> memory estimate: 138.09 MiB
> allocs estimate: 2685005
> --------------
> minimum time: 57.506 ms (38.04% GC)
> median time: 82.732 ms (34.48% GC)
> mean time: 83.229 ms (35.62% GC)
> maximum time: 110.638 ms (24.88% GC)
> --------------
> samples: 61
> evals/sample: 1
I believe the performance bottleneck is the `interp_points` parameter of the callback, which defaults to 10. My guess is that at each accepted time step, 10 new interpolated state arrays (each of length 10 here) are allocated so the condition can be checked for sign changes between steps. Repeated over many steps, this creates a big bottleneck.
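To illustrate the kind of allocation pattern I suspect (this is a hypothetical sketch, not the actual DifferentialEquations.jl internals), compare a loop that allocates a fresh array at every interpolation point against one that reuses a single preallocated buffer:

```julia
# Hypothetical illustration of per-interpolation-point allocation.
# `n_steps`, `interp_points`, and `u_len` are made-up stand-in parameters.

function check_allocating(n_steps, interp_points, u_len)
    hits = 0
    for _ in 1:n_steps
        for _ in 1:interp_points
            u_interp = zeros(u_len)      # fresh allocation per interpolation point
            hits += sum(u_interp) < 0    # stand-in for the condition check
        end
    end
    return hits
end

function check_buffered(n_steps, interp_points, u_len)
    buf = zeros(u_len)                   # one buffer, reused every time
    hits = 0
    for _ in 1:n_steps
        for _ in 1:interp_points
            fill!(buf, 0.0)              # overwrite in place instead of allocating
            hits += sum(buf) < 0
        end
    end
    return hits
end
```

With `@allocated`, the first version's allocations grow with `n_steps * interp_points`, while the second allocates the buffer once. If the callback machinery behaves like the first version, that would explain the ~2.7 million allocations above.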
If I set `interp_points` to 0, the solver becomes fast again.
cb = ContinuousCallback(condition, affect!, save_positions=(false, false), interp_points=0)
@benchmark sol = solve(prob, DP5(), save_everystep=false, callback=cb)
> BenchmarkTools.Trial:
> memory estimate: 128.22 KiB
> allocs estimate: 2285
> --------------
> minimum time: 506.000 μs (0.00% GC)
> median time: 1.125 ms (0.00% GC)
> mean time: 1.141 ms (2.89% GC)
> maximum time: 9.762 ms (84.72% GC)
> --------------
> samples: 4345
> evals/sample: 1
Question: how can I make the callback run faster while still having interpolation? Would it be possible to change the callback implementation so that it writes to the same memory each time rather than allocating new arrays? Perhaps each ContinuousCallback instance could store a field such as `interp_u` that is reused again and again.
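To make the proposal concrete, here is a minimal sketch of what I have in mind. Everything below is hypothetical (`BufferedContinuousCallback`, `interp_u`, and `evaluate_condition!` are names I'm inventing for illustration, not existing library API): the callback owns a buffer that the solver fills in place at each interpolation point.

```julia
# Hypothetical design sketch, not actual DifferentialEquations.jl code:
# a callback that carries a preallocated buffer for interpolated states.

struct BufferedContinuousCallback{C,A,T}
    condition::C
    affect!::A
    interp_u::Vector{T}   # reused buffer, sized to the state vector
end

BufferedContinuousCallback(condition, affect!, u_len::Int) =
    BufferedContinuousCallback(condition, affect!, zeros(u_len))

# The solver would write the interpolant into `cb.interp_u` in place and
# evaluate the condition on it, so each check costs zero allocations.
function evaluate_condition!(cb::BufferedContinuousCallback, interpolate!, t, integrator)
    interpolate!(cb.interp_u, t)              # fill buffer in place
    return cb.condition(cb.interp_u, t, integrator)
end
```

The trade-off would be that the buffer's contents are only valid until the next check, so user condition functions must not retain a reference to `u` across calls.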