Stochastic differential equation with high dimensions is using too much memory and taking too long

Hey! I want to integrate a system of 40+ differential equations each with independent additive white noise. With the current implementation I’m trying this is using way too much memory and is too slow. A minimal example of this is:

using StochasticDiffEq, DiffEqNoiseProcess, OrdinaryDiffEq

function system!(du, u, p, t)
    du .= 0
end

function noise!(du, u, p, t)
    noise = p[1]
    du .= noise
end

function test_memory_usage()
    N = 40
    noise = 0.1
    ps =  [noise]
    u0s = ones(Float64, N)
    tend = 50000.0
    ttrans = 0.0 
    noise_seed = 1
    alg = EM()
    dt = 0.01
    Δt = 10.0
    
    #1. For reference, integrate without noise 
    prob_ode = ODEProblem(system!, u0s, (0.0, tend), ps)
    sol_ode = @time solve(prob_ode, Euler(); dt, saveat=ttrans:Δt:tend)

    #2. with the Wiener Process it's way too slow 
    W = WienerProcess(0.0, zeros(length(u0s)))
    prob_independent_noise = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)
    sol_inde = @time solve(prob_independent_noise, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);  

    #3. in place doesn't help
    W = WienerProcess!(0.0, zeros(length(u0s)))
    prob_independent_noise_inplace = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)
    sol_inde_in = @time solve(prob_independent_noise_inplace, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);  

    #4. without the Wiener Processes it's way faster, but not the same result.
    prob_independent_noise_2 = SDEProblem(system!, noise!, u0s, (0.0, tend), ps)
    sol_inde_2 = @time solve(prob_independent_noise_2, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);   

    #5. Single process, with dependent noise is also fast. 
    W = WienerProcess(0.0, 0.0)
    prob_dependent_noise = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)
    sol_de = @time solve(prob_dependent_noise, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed) 

    nothing
end

The results I got are:

  1.925835 seconds (2.47 M allocations: 136.644 MiB, 69.99% compilation time)
 15.889349 seconds (42.77 M allocations: 13.568 GiB, 35.68% gc time, 17.75% compilation time)
  7.392910 seconds (13.65 M allocations: 4.117 GiB, 29.45% gc time, 21.90% compilation time)
  2.718504 seconds (2.95 M allocations: 143.125 MiB, 47.44% compilation time)
  2.235974 seconds (3.59 M allocations: 396.477 MiB, 61.13% compilation time)

I also don’t understand why case 4 does not lead to the same result as case 1.

Could someone please help me here? I’m guessing I’m messing something up, but have no clue what.

Thanks a lot!

Check a profile. What is taking the memory and time? I wouldn’t be surprised if it’s the saving choice and the RNG.

Thanks a lot for the suggestion!

I’m a noob with profilers, but using ProfileView it seems that a lot of time is spent on WHITE_NOISE_DIST, which calls wiener_randn and then randn. Also some time is spent on copying noise processes at copy in copy_noise_types.jl

The saving choice is the same for all cases, even the non-problematic ones (1, 3, 5). So saving could only be causing problems in noise processes, correct? Which could explain the copy part.

I can also profile memory allocations specifically. Do you have any suggestions for which package to use?

makes sense.

Which line?

https://docs.julialang.org/en/v1/manual/profile/#Allocation-Profiler

Ok, with

    N = 40
    noise = 0.1
    ps =  [noise]
    u0s = ones(Float64, N)
    tend = 10000.0
    ttrans = 0.0 
    noise_seed = 1
    alg = EM()
    dt = 0.01
    Δt = 10.0
    
    W = WienerProcess(0.0, zeros(length(u0s)))
    prob_independent_noise = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)
    Profile.Allocs.clear()
    a = Profile.Allocs.@profile solve(prob_independent_noise, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);  
    PProf.Allocs.pprof()

I got this flame graph:

As far as I understood, there are unnecessary allocations. The source for a few of the calls:

oh wait, you’re manually constructing a WienerProcess, but not using the in-place form? That’s going to make it a lot slower. Use WienerProcess!.

Classic Noise Processes · DiffEqNoiseProcess.jl!

Use WienerProcess!

I tried the in-place form like this:

    W = WienerProcess!(0.0, zeros(length(u0s)))
    prob_independent_noise_inplace = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)
    sol_inde_in = @time solve(prob_independent_noise_inplace, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);  

(case 3 in my first post). It allocated less, but still a lot. Am I using it wrong?

What are the dominant spots in that case? It should be fairly different, since what you were pointing to before were the allocations that this would eliminate.

Fair, some of them are gone, consistent with a speed up and less allocations. But there are still copy calls:

Also, to reiterate, the solver runs really nicely without the noise process:

    prob_independent_noise_2 = SDEProblem(system!, noise!, u0s, (0.0, tend), ps)
    sol_inde_2 = @time solve(prob_independent_noise_2, alg; dt, saveat=ttrans:Δt:tend, seed=noise_seed);   

though the results differ. Do you know why?

Thanks again :slight_smile:

Try W = WienerProcess!(0.0, zeros(length(u0s)), save_everystep = false)

That did it, awesome! Now it’s same speed and memory as solving without the WienerProcess!

One last question: why is the solution different with and without specifying the WienerProcess?

    prob_1 = SDEProblem(system!, noise!, u0s, (0.0, tend), ps)
    prob_2 = SDEProblem(system!, noise!, u0s, (0.0, tend), ps, noise=W)

Are the realizations of the noise just different?

Thanks a lot! :grin:

That should be all that’s different.

Seems to be. Awesome, thanks once more!