Minimising DifferentialEquations GPU memory allocation

I’m simulating the Gross-Pitaevskii Equation for a three-dimensional array with M^3 values, and am running into some issues with GPU memory for large M values. In short, I’m wondering if there is anything more I can do to reduce how much memory solve() is allocating?

Currently I’m using CUDA.jl so that the dynamics are solved on the GPU, and am using a callback to intermittently save data back to CPU directly (see MWE below). I’ve managed to push N to ~ 300, but anything much bigger than that and it runs out of memory very quickly.

using FFTW, CUDA, DifferentialEquations, LinearAlgebra, Plots

    function kfunc_opt!(dψ,ψ)
        dψ .*= k2
        return nothing

    function GPE!(dψ,ψ,var,t) # GPE Equation 
        @. dψ = -(im + γ)*(0.5*dψ + (V_0 + abs2(ψ) - 1)*ψ)

    function GPU_Solve(EQ!, ψ, tspan) 
        savepoints = tspan[2:end]  
        condition(u, t, integrator) = t ∈ savepoints
        function affect!(integrator)                 
            i += 1
            ret[:,:,:,i] .= Array(integrator.u)
        ret = zeros(M,M,M,length(tspan)) .|> complex 
        ret[:,:,:,1] .= Array(ψ)
        cb = DiscreteCallback(condition, affect!)   
        i = 1                                           
        prob = ODEProblem(EQ!,ψ,(tspan[1],tspan[end]))   
        solve(prob, callback=cb, tstops = savepoints, save_start = false, save_everystep = false, save_end = false)
        return ret

    L = 8
    M = 60
    x = LinRange(-L,L,M) |> cu;
    dx = x[2] - x[1]
    kx = fftfreq(M,2π/dx) |> collect |> cu;
    dkx = kx[2] - kx[1]

    k2 =  kx.^2 .+ kx'.^2 .+ reshape(kx,(1,1,M)).^2;
    V_0 = 0.3*[i^2 + j^2 + k^2 for i in x, j in x, k in x] |> cu;

    const Pf = Float32(dx^3/(2π)^1.5)*plan_fft(cu((rand(M,M,M) + im*rand(M,M,M))));
    const Pi! = Float32(M^3*dkx^3/(2π)^1.5)*plan_ifft!(cu((rand(M,M,M) + im*rand(M,M,M))));

γ = 1
tspan = LinRange(0.0,3,30); 

res_GS = GPU_Solve(GPE!,cu((randn(M,M,M) + im*randn(M,M,M))),tspan);

    t = 7 # Change this to look at different times

If any other info is required, or if there are any other glaring issues with my code please let me know. Cheers.

Is the equation stiff? If you don’t need the Jacobian then you can specialize on that, and if you do the need the Jacobian you should declare its sparsity pattern.

1 Like

Dont you have globals? Pf and Pi!


My bad, have fixed the MWE and made sure it’s actually running properly. Let me know if it’s still a bit unclear

The biggest allocation is still going to be the Jacobian

I’m not 100% sure if it’s stiff or not, the main (dimensionless) equation of motion is:

\frac{\partial \psi (\textbf{r},t)}{\partial t} = -\frac{i}{h} \left[ \frac{\nabla^2}{2} + V_0(\textbf{r}) + | \psi |^2 \right] \psi

Where V_0 is just a number at each point. Would this be stiff because of the -|\psi|^2 \psi term?

Is that a typo? Doesn’t make sense to multiply by Pi! without doing anything with it.

Nah not a typo, Pi!*d\psi does an in-place inverse fourier transform on d\psi

Just try choosing an explicit RK method like Tsit5