Continuous callback does not work correctly with Ensemble GPUArray()

mabrudsky · November 14, 2022, 6:43pm

Hello, I am solving a central potential problem with EnsembleGPUArray() for multiple initial conditions, when I add the ContinuousCallback the first 800 trajectories ignore the affect!(integrator) and the last 200 do the cut correctly. If I solve for 10_000 trajectories, the first 8000 trajectories ignore the cut and the last 2000 do it correctly, similarly for 1_000_000 trajectories. I did several tests but I didn’t find what the problem is.

The code is the following

using CUDA
using DiffEqGPU
using NPZ
using OrdinaryDiffEq
using Plots
using PlotlyBase
using PlotlyKaleido

plotly()
Plots.PlotlyBackend()
CUDA.device!(1)

CuDevice(1): NVIDIA A30

"""
     pot_central!(du,u,p,t)
     u=[x,dx,y,dy,# of steps]
     p=[k,m,dt]
"""
function pot_central!(du,u,p,t)
        r3 = ( u[1]^2 + u[3]^2 )^(3/2)
     du[1] = u[2]                                     # u[2]= dx
     du[2] = -( p[1]*u[1] ) / ( p[2]*r3 )   
     du[3] = u[4]                                     # u[4]= dy
     du[4] = -( p[1]*u[3] ) / ( p[2]*r3 )
     du[5] = 1/p[3]
end

T=100.0
trajectories=1_000
dt=0.01

u_rand = convert(Array{Float64},npzread("IO_GPU/IO_u0.npy"))
    u0 = [2.0, 2.0, 1.0, 1.5,dt]
     p = [1.0,1.0,dt]               #[k,m,dt]
 tspan = [0.0,T]
    R2 = [4.5,5_000.0]              # R2=[Rmin2,Rmax2]

prob= ODEProblem{true}(pot_central!,u0,tspan,p)
prob_func= (prob,i,repeat) -> remake(prob,u0 = [u_rand[i,:];dt].*u0 + [1.0,1.0,1.0,1.0,dt])
Ensemble_Problem=EnsembleProblem(prob,prob_func=prob_func,safetycopy=false) 

function condition(u,t,integrator)
        r2 = u[1]*u[1] + u[3]*u[3]
        (R2[2] - r2)*(r2 - R2[1])
end

affect!(integrator) = terminate!(integrator)
gpu_cb =  ContinuousCallback(condition, affect!;save_positions=(true,false),rootfind=true,interp_points=0,abstol=1e-7,reltol=0)

ContinuousCallback{typeof(condition), typeof(affect!), typeof(affect!), typeof(SciMLBase.INITIALIZE_DEFAULT), typeof(SciMLBase.FINALIZE_DEFAULT), Float64, Int64, Rational{Int64}, Nothing, Int64}(condition, affect!, affect!, SciMLBase.INITIALIZE_DEFAULT, SciMLBase.FINALIZE_DEFAULT, nothing, SciMLBase.LeftRootFind, 0, Bool[1, 0], 1, 1.0e-7, 0, 1//100)

CUDA.@time sol= solve(Ensemble_Problem,
                                Tsit5(),
                                EnsembleGPUArray(),
                                dt=0.01,
                                trajectories=trajectories,
                                #batch_size=10_000,
                                callback = gpu_cb,
                                adaptive=false,
                                save_everystep=false
                                )

0.067974 seconds (780.84 k CPU allocations: 36.133 MiB) (52 GPU allocations: 551.062 KiB, 0.66% memmgmt time)
EnsembleSolution Solution of length 1000 with uType:
ODESolution{Float64, 2, uType, Nothing, Nothing, Vector{Float64}, rateType, P, Tsit5{typeof(OrdinaryDiffEq.trivial_limiter!), typeof(OrdinaryDiffEq.trivial_limiter!), Static.False}, IType, DiffEqBase.DEStats, Nothing} where {uType, rateType, P, IType}

The solutions for some trajectories are as follows

sol[1]

retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
0.0
100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
[1.6916643844888346, 1.3952481788899702, 1.4257929742644806, 2.3254414019025376, 0.0101]
[130.3632437338083, 1.283605087637319, 221.2362706355311, 2.193291036113921, 10000.010099999996]

sol[800]

retcode: Success
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
0.0
100.0
u: 2-element Vector{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}:
[2.3286232359143177, 2.0849960853408054, 1.5481395680210261, 1.0878883702422468, 0.0101]
[198.03489179352627, 1.9510332210035648, 102.76307791913189, 1.008911077728196, 10000.010099999996]

sol[801]

retcode: Terminated
Interpolation: 1st order linear
t: 2-element Vector{Float64}:
0.0
28.22912660730258
u: 2-element Vector{Vector{Float64}}:
[1.3698092340031855, 2.018364836988968, 1.9808709009626209, 1.5912143969170325, 0.0101]
[55.503066919039185, 1.9067772301987758, 43.81106666791687, 1.4723417477028182, 2822.922760730258]

This histogram shows the distances r^2 for 1_000_000 trajectories.

callback_trajectories_r2

The first column is the trajectories that end at Rmin2=4.5, the second at Rmax2=5_000 and the rest those that ignore the Callback. This structure is repeated for any number of trajectories.

From already thank you very much.

Topic		Replies	Views
Why does EnsembleGPUArray not save at the given time points the solution? GPU	1	360	June 4, 2022
Continuously tracking callback with boolean condition General Usage question , ode , differentialequation , diffeqcallbacks	1	249	November 30, 2023
ContinuousCallback does not work for backward integration General Usage diffeq	2	28	May 5, 2025
Errors on 2nd Iteration of EnsembleGPUArrays when using Lux Modelling & Simulations	2	104	June 28, 2024
Is it possible to unsynch the EnsembleGPUArrays? GPU	6	494	May 20, 2022

Continuous callback does not work correctly with Ensemble GPUArray()

Related topics