How to view intermediate steps in ODESolver during runtime without excess allocations?

I am new to Julia and have been solving an ODE problem using the integrator interface. I want to view the intermediate steps (and use the intermediate steps in another function to perform some calculations) as the integrator solves the problem. This is how the code looks now:

prob = ODEProblem(differential_eq, u0, tspan, p)
integrator = OrdinaryDiffEq.init(prob, alg(), save_everystep = false, maxiters = max_steps*1e5, abstol=1e-8, reltol=1e-8, callback=callbacks; kwargs...)   

for integ in integrator
   fun(integ)    # Note: this function doesn't changes the values inside integ
end

This works fine, but while looking at the allocations, it seems this method allocates more memory than just using save_everystep, and performing the calculations afterward. As far as I understand, this extra allocation is coming from getproperty, specifically when I am trying to access the time steps or intermediate steps by doing i.t or i.u. I was wondering if there is a way to do this without allocating the extra memory.

Any help regarding this is appreciated!

Hi and welcome to the forum! :wave:

Did you try to wrap the whole code inside a function? As mentioned in the Performance Tips · The Julia Language using global variables should generally be avoided.

If the same behavior appears also without using global variables, it would be helpful to share a full working example of the code that can be copy-pasted to run and investigate the issue.

Hi Sevi! Thanks for your answer. The code is already inside a function, and there are no global variables.

Here is simplified version of the code that illustrates the issue:

using OrdinaryDiffEq
using BenchmarkTools

function diffeq(u, p, t)
    return -0.5 * u
end 

function integrate(alg=OrdinaryDiffEq.RK4)
    
    tspan = (0, 1e3)

    u0 = 1

    prob = ODEProblem(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg(), save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    x = 0.0
    for i in integrator
        x = i.u 
    end
end 

@benchmark integrate()

Memory estimate: 32.52 KiB, allocs estimate: 1040.

#commenting out x = i.u and running @benchmark

Memory estimate: 25.39 KiB, allocs estimate: 632.

Running @benchmark on this would allocate almost twice the number of allocations it does if you comment out the x = i.u line inside the loop for the integrator. In my code I run the integrator multiple times, and hence the allocation difference gets much higher. I hope this describes the issue better and if so, is there a more efficient way to do this?

Check the profile. Where does the flame graph say the allocations are coming from?

Hi Chris! There seems to be two sources of allocations. First, getproperty seems to trigger some excess allocations.

Second, the function I am calling from inside the integrator interface (function “fun” in the original post) seems to trigger a type-instability (the profiler shows “Unknown type”). Note that I have used the same function outside of the integrator, by keeping save_everystep = true, and running the function at the end of solve with the integration data. In this case, the function is type-stable. But running it inside the loop triggers type-instability.

Just adding to this, if I run the function with some random arguments instead of passing the integrator.u values, then it’s type-stable again. So, I think passing the integrator.u as an argument to the function is triggering the issue. But then again, passing the solution.u to the same function at the end of the solve doesn’t trigger this issue.

Let me also add one last thing here. There seems to be a difference in the number of allocation while solving the ODE with solve vs the integrator interface. For the simple example I mentioned above, running @benchmark with solve!(integrator) shows:

Memory estimate: 8.06 KiB, allocs estimate: 158.

Whereas using the integrator interface (with no functions inside or excess calculations inside the loop), gives:

Memory estimate: 24.95 KiB, allocs estimate: 625.

I thought solve! and using the loop with the integrator interface are the same, but it doesn’t seems to be the case?

What if you don’t rely so much on inference and conversions? Change tspan = (0.0, 1e3) and u0 = 1.0. In theory that won’t do anything, but want to eliminate anything extra.

That seems to reduce the allocations a little bit. With integrator interface:

Memory estimate: 17.95 KiB, allocs estimate: 511.

With solve!():

Memory estimate: 5.81 KiB, allocs estimate: 120.

Basically, I want to do some calculations with the intermediate steps without actually saving the values. If there is a better way to implement this, please let me know.

Try:

function integrate(alg=OrdinaryDiffEq.RK4())
    
    tspan = (0.0, 1e3)

    u0 = 1.0

    prob = ODEProblem{false}(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg, save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    x = 0.0
    for i in integrator
        x = i.u 
    end
end 

The method inference is supposed to be static on v1.10 @oxinabox @jameson ? Or is that v1.11?

2 Likes

This seems to reduce the allocations, but there seems to be a difference between solve!() vs integrator interface:

function integrate(alg=OrdinaryDiffEq.RK4())
    
    tspan = (0.0, 1e3)

    u0 = 1.0

    prob = ODEProblem{false}(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg, save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    x = 0.0
    solve!(integrator)
end 

@benchmark integrate()

Memory estimate: 1.72 KiB, allocs estimate: 22.

Versus:

function integrate(alg=OrdinaryDiffEq.RK4())
    
    tspan = (0.0, 1e3)

    u0 = 1.0

    prob = ODEProblem{false}(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg, save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    x = 0.0
    for i in integrator
    end
end 

@benchmark integrate()

Memory estimate: 13.97 KiB, allocs estimate: 414.

If you avoid the iterator interface and union splitting it seems fine:

function integrate(alg=OrdinaryDiffEq.RK4())
    
    tspan = (0.0, 1e3)

    u0 = 1.0

    prob = ODEProblem{false}(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg, save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    x = 0.0
    while integrator.t < tspan[2]
        step!(integrator)
    end
end 

@btime integrate()
62.208 ÎĽs (22 allocations: 1.73 KiB)
1 Like

Hi Chris! Thank you for your answer. This solved the issue of the two methods allocating differently. But I am still getting a lot of allocations when trying to access any field of the integrator inside the loop from getproperty. Just to give an example, in the following code (from my original code) the profiler shows that 92% of the allocation is coming from the single line x::Float64 = integrator.u[2,1]:

    while true
        step!(integrator)
        x::Float64 = integrator.u[2, 1]
        if x < x1 || x > x2
            break
        end
    end

There are multiple GC flags, one of which comes from getproperty, and the other one from UnknownType. I also see allocations if I try to access integrator.t. I don’t understand where these allocations are coming from. In theory, all I want to do is access the intermediate steps in the integrator without saving them, and hence they shouldn’t allocate any memory at all?

I suspect that some of the type returned by the setup are not inferrable. Try a function barrier before you enter the loop and see whether this recovers the performance of solve!:

function inner(integrator)
    x = 0.0
    for i in integrator
        x = i.u 
    end
end

function integrate(alg=OrdinaryDiffEq.RK4())
    
    tspan = (0.0, 1e3)

    u0 = 1.0

    prob = ODEProblem{false}(diffeq, u0, tspan)
    integrator = OrdinaryDiffEq.init(prob, alg, save_everystep = false, maxiters = 1e4, abstol=1e-8, reltol=1e-8) 
    inner(integrator)
end 

@btime integrate()
2 Likes

Why so? Cthulhu says it’s fine and iirc there’s tests on that.

There does seem to be an issue with inference on the iterator though, which is odd because it should union split. @gbaraldi could you dig into this example a bit?

I remember that I had this problem once (some longer time ago though - maybe I just didn’t use the ODEProblem{true/false}) and didn’t check here specifically.

However, the function barrier is basically the only other difference between this code here and just calling solve that I can think of. Additionally, it is quick and easy to test so might still be worth a try :slight_smile:

Hey Adrian and Chris! So I implemented the function barrier, and this worked like a charm! The excess allocations are gone, and the integrator subfields type, at least inside the barrier function, are all inferrable.

There is still some small allocations coming from when the barrier function is called, because the type is still non-inferrable there. But these allocations are small since it’s a one time overhead, and doesn’t happens inside the loop of the integrator. I still cannot wrap my head around why this worked, but thanks @Adrian for suggesting this. The code works fine now :smiley:

Okay cool. We will dig into this some more and try to fix whatever is making that required.

2 Likes

1.10
To be precise: v1.10.0-DEV.609.
which makes hasmethod resolve at compile time.
But not methods, that still runs at runtime

1 Like