DiffEqGPU - error calling solve() for SDE driven by 2 correlated Brownians

Julia version: 1.12.5 (latest stable) on Linux, with all packages up to date

I have a 2D SDE driven by two Brownian motions with correlation ρ, and I'm trying to solve it on the GPU using SciML. I should note that I am separately able to Monte Carlo my drift and diffusion fine on the GPU (without SciML, using plain CUDA), and I'm also able to solve the EnsembleProblem on the CPU with EnsembleThreads(). But I'm struggling to get it to solve with DiffEqGPU.

Here's the code, which fails when running solve:

using DifferentialEquations, DiffEqGPU, CUDA, Statistics, StaticArrays

# In-place drift: both states are driftless.
function drift!(du, u, p, t)
    du[1] = 0.0
    du[2] = 0.0
end

function diffusion!(du, u, p, t)
    β, ν, ρ = p
    # Row 1: dF = α F^β dW₁ (the vol α is the second state, u[2]).
    du[1, 1] = u[2] * u[1]^β
    du[1, 2] = 0.0

    # Row 2: dα = ν α dW₂, with corr(W₁, W₂) = ρ handled via Cholesky.
    du[2, 1] = ν * u[2] * ρ
    du[2, 2] = ν * u[2] * sqrt(1.0 - ρ^2)
end
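
# Sanity check (not part of the MWE): the per-unit-vol factor
# L = [1 0; ρ √(1-ρ²)] used in the two rows above is the lower Cholesky
# factor of the driver correlation matrix, so L*L' should give [1 ρ; ρ 1].
let ρ = -0.3
    L = [1.0 0.0; ρ sqrt(1.0 - ρ^2)]
    @assert L * L' ≈ [1.0 ρ; ρ 1.0]
end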


function test()

    β = 0.5
    ν = 0.4 
    ρ = -0.3
    F0 = 0.03
    α = 0.20
    T = 1.0
    n_paths = 100_000
 
    u0    = @SVector Float32[F0, α]
    tspan = (0.0, T)
    p     =  @SVector Float32[β, ν, ρ]

    noise_rate_prototype = @SMatrix zeros(Float32, 2, 2)

    prob = SDEProblem(
        drift!,
        diffusion!,
        u0,
        tspan,
        p;
        noise_rate_prototype = noise_rate_prototype
    )

    ensemble_prob = EnsembleProblem(prob)

    sol = solve(
        ensemble_prob,
        GPUTsit5(),
        DiffEqGPU.EnsembleGPUKernel(CUDA.CUDABackend());
        dt           = T / 300,
        trajectories = n_paths,
        save_everystep = false,
        save_start     = false,
        adaptive = false
    )

    return sol

end


test()

And I get this error when running:

ERROR: UndefVarError: `kernel` not defined in local scope
Suggestion: check for an assignment to a local variable that shadows a global of the same name.
Stacktrace:
 [1] vectorized_solve(probs::CuArray{…}, prob::SDEProblem{…}, alg::GPUTsit5; dt::Float64, saveat::Nothing, save_everystep::Bool, debug::Bool, kwargs::@Kwargs{…})
   @ DiffEqGPU ~/.julia/packages/DiffEqGPU/ExLjA/src/ensemblegpukernel/lowerlevel_solve.jl:166
 [2] batch_solve_up_kernel(ensembleprob::EnsembleProblem{…}, probs::Vector{…}, alg::GPUTsit5, ensemblealg::DiffEqGPU.EnsembleGPUKernel{…}, I::UnitRange{…}, adaptive::Bool; kwargs::@Kwargs{…})
   @ DiffEqGPU ~/.julia/packages/DiffEqGPU/ExLjA/src/solve.jl:378
 [3] batch_solve(ensembleprob::EnsembleProblem{…}, alg::GPUTsit5, ensemblealg::DiffEqGPU.EnsembleGPUKernel{…}, I::UnitRange{…}, adaptive::Bool; kwargs::@Kwargs{…})
   @ DiffEqGPU ~/.julia/packages/DiffEqGPU/ExLjA/src/solve.jl:221
 [4] macro expansion
   @ ./timing.jl:461 [inlined]
 [5] __solve(ensembleprob::EnsembleProblem{…}, alg::GPUTsit5, ensemblealg::DiffEqGPU.EnsembleGPUKernel{…}; trajectories::Int64, batch_size::Int64, unstable_check::Function, adaptive::Bool, kwargs::@Kwargs{…})
   @ DiffEqGPU ~/.julia/packages/DiffEqGPU/ExLjA/src/solve.jl:69
 [6] solve(::EnsembleProblem{…}, ::GPUTsit5, ::Vararg{…}; kwargs::@Kwargs{…})
   @ SciMLBase ~/.julia/packages/SciMLBase/eTBDr/src/ensemble/basic_ensemble_solve.jl:387
 [7] test()
   @ Main ~/workspace/test/scratchpad/mwe.jl:220
 [8] top-level scope
   @ ~/workspace/test/scratchpad/mwe.jl:236
Some type information was truncated. Use `show(err)` to see complete types.

The suggestion provided by the error message doesn’t seem relevant in this case.

Curious if anybody has encountered something similar?

GPUTsit5 is an ODE solver, not an SDE solver; you need to use a solver for SDEs. Not sure why that's the error you get, but it's just a missing dispatch.

GPUEM and GPUSIEA are the documented methods for this. Did you try them?

Thanks Chris. I had a couple of other issues in the code: the drift and diffusion had to be changed to out-of-place functions returning stack-allocated variables. I figured that part out from the documented API example for ODEs on the GPU, as examples for SDEs on the GPU are harder to find.

But doing that and using GPUEM() did the trick, thank you.
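
For future readers, here's roughly the out-of-place form that ended up working for me (a sketch using the same parameters as the MWE above; the GPU solve is guarded by CUDA.functional() so the setup runs anywhere, and exact kwargs may need adjusting for your DiffEqGPU version):

```julia
using DifferentialEquations, DiffEqGPU, CUDA, StaticArrays

# Out-of-place drift: returns an SVector instead of mutating du.
drift(u, p, t) = @SVector zeros(Float32, 2)

# Out-of-place diffusion: returns an SMatrix (2 states × 2 Brownians),
# same Cholesky layout as the in-place version above.
function diffusion(u, p, t)
    β, ν, ρ = p
    @SMatrix Float32[u[2]*u[1]^β 0.0f0;
                     ν*u[2]*ρ    ν*u[2]*sqrt(1.0f0 - ρ^2)]
end

u0 = @SVector Float32[0.03, 0.20]   # (F0, α)
p  = @SVector Float32[0.5, 0.4, -0.3]  # (β, ν, ρ)
noise_prototype = @SMatrix zeros(Float32, 2, 2)

prob = SDEProblem(drift, diffusion, u0, (0.0f0, 1.0f0), p;
                  noise_rate_prototype = noise_prototype)
ensemble_prob = EnsembleProblem(prob)

if CUDA.functional()
    # Note: the ensemble algorithm is a POSITIONAL argument, not a kwarg.
    sol = solve(ensemble_prob, GPUEM(), EnsembleGPUKernel(CUDA.CUDABackend());
                dt = 1.0f0 / 300, trajectories = 100_000,
                save_everystep = false, save_start = false)
end
```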

It also turns out that passing ensemblealg as a kwarg instead of positionally was silently sending me straight to the CPU (multithreaded) instead of the GPU, which added to the confusion.

Yeah, it looks like we need a bit better validation of the args here. Could you open an issue? A few proactive error messages could have made this 200x easier.

Will do as soon as I get a chance, thanks Chris

@ChrisRackauckas hopefully this is descriptive enough. Let me know if I can help in any other way. Much appreciated.

That is great, thank you.