"Instability detected. Aborting" when increasing the dimension of an ODE system

Hello,

I am modelling soil bacteria degrading substrates with a spatially explicit model (i.e. a grid with one colony of bacteria in each grid cell). I have 15 equations to describe dynamics in each grid cell and I first consider a 10x10 grid. The model runs perfectly for this setup but, when I increase the grid size, the solver becomes less and less stable and aborts simulations “Warning: Instability detected. Aborting”. For instance, for a 32x32 grid, none of my simulations run until the end. The coding of the model shall be fine because:

  • List item Mass balance is respected and the system is modelled as a chemostat, which means that no state variable can grow exponentially. It behaves normally for a 3x3 or 10x10 grid.
  • List item Interactions in the grid only occur between the four neighbours, which means that increasing the size of the grid does not increases the calculation time per grid cell.
  • List item The system does not seem to be stiff because I carefully chose the units of parameters for their value to be not larger than 100. However I am not 100% confident on this point…

To what I understand, abortion is caused by the algorithm producing diverging trajectories. I tried to lower the relative and absolute errors or to test other solvers such as Rosenbrock23() or Rodas5P() instead of the default Tsit5() as suggested by previous posts but this did not solve my problem. I suspect I did not choose the appropriate algorithm or options for such a large system but I did not manage to identify any solution in the documentation…

Thanks for helping!

Can you give more details about the equations you are trying to solve? A MWE would be useful.

Unfortunately, I cannot really provide a MWE because I need hundreds of lines of code to implement the dynamics of the model, it is not a simple set of ODEs that can we written in a few lines. In a nutshell, the model represents mass transfers between compartments and it has several callbacks to implement discontinuities due to random events such as population collapse or grid cell colonisation. The dynamics cannot overshoot and dynamics are normal before the system aborts, it just seems that the algorithm randomly fails when the system becomes to big. It is fine with 1,500 equations but the solver has problems with 15,000 for instance.

Did you check and find out which ODE goes unstable, and why? Really hard to debug your model without having the equations, but this almost certainly sounds like a model problem if you can swap to different solvers (including C++ and Fortran ones) and every solver sees the same behavior.

1 Like

timescales can be important here… When solving such spatial systems, if you increase the spatial resolution you also need to increase the temporal resolution. try decreasing the dt and dtmax in the solvers, and if using adaptive ones decrease the tolerances.

Provide a MWE and exact solver options used as well.

for the typical system behavior. however if you take too large of a step you can diverge without it being expected or “physical”. Happens all the time in my systems even though they are very well physically bounded. Again timescales, dt, and solver options are important,.

Thanks for all your first comments!

According to my time series, none of my ODE diverges. Is there a function returning which state variable becomes unstable during simulations?

Here the solver options. As I said previously, I cannot provide a MWE because the ODE function represents hundreds of lines of code. I use an adaptive time step and I already decreased the tolerance. Default tolerance works fine for a system with 1,500 ODE but I got instability with 15,000 ODE, even with a tolerance of 1e-8.

tspan=(t_min, t_max)
event_step=collect(t_min:t_step:t_max-t_step)
record_step=collect(t_record:t_step_record:t_max)
# X represents the vector of initial conditions
cb1 = PositiveDomain(X; save = false, abstol = 1E-9)
# callback for random events
condition_2(u, t, integrator) = t in event_step
cb2 = DiscreteCallback(condition_2, function_event, save_positions=(false,false))
dynamics_cb=SavedValues(Float64,NTuple{28,Vector{Float64}}) # structure storing the other variables
# callback to save variables other than state variables
cb4=SavingCallback(function_SavingCallback,
                      dynamics_cb,
                       saveat=record_step)
cb = CallbackSet(cb1,cb2,cb4) # merges the callbacks
prob=ODEProblem(function_ODE,X,tspan,z,saveat=record_step,tstops=event_step,callback=cb,)
dynamics=solve(prob,Tsit5(),maxiters=Int(1e9),abstol=1e-8,reltol=1e-8)

I know that, it can be easily experimented with the basic Euler method. I just wanted you to know that I had already checked for the “mathematical” stability of the model.

did you do that?

So you tried Sundials.CVODE_BDF, ODEInterfaceDiffEq.radau, Rodas5P, and FBDF and all were unstable?

That’s going to change the stiffness, so you tested solvers for stiff equations as well?

1 Like

I have tried the solver you proposed but none of them worked.

dynamics=solve(prob,CVODE_BDF(),maxiters=Int(1e9),abstol=1e-8,reltol=1e-8,dtmax=0.1)

[CVODES ERROR] CVode
At t = 0 and h = 7.18621e-016, the corrector convergence test failed repeatedly or with |h| = hmin.

dynamics=solve(prob,radau(),maxiters=Int(1e9),abstol=1e-8,reltol=1e-8,dtmax=0.1)

ERROR: Sorry; radau5/radau does not support to change the solution inside the output function.
I guess the solver does not like the use of callbacks to implement discontinuities. I have to use callbacks to modify integrator.u to implement random events such as catastrophic colony collapse, colonisation and mutation.

dynamics=solve(prob,Rodas5P(autodiff=false),maxiters=Int(1e9),abstol=1e-12,reltol=1e-12,dtmax=0.1)

Instability detected. Aborting

dynamics=solve(prob,FBDF(autodiff=false),maxiters=Int(1e9),abstol=1e-8,reltol=1e-8,dtmax=0.1)

dt(1.1368683772161603e-13) <= dtmin(1.1368683772161603e-13) at t=0.0, and step error estimate = 1.0. Aborting. There is either an error in your model specification or the true solution
is unstable.

dynamics=solve(prob,lsoda(),maxiters=Int(1e9),abstol=1e-6,reltol=1e-6) 

ERROR: LSODA is not compatible with callbacks.

I could only run the model with Tsit5 so far.

I am trying it on the HPC.

Last time I faced this issue (also due to increasing the number of equations), I switched to a higher order integrator such as Vern7/Vern8 and reduced tolerances to 1e-10 or 1e-11 and that was enough in my case to make it work on the relevant timescales :slight_smile:

If all solvers do the same thing, even ones written in different languages, then I would highly suspect there’s something wrong with the equations.

@abraemer your solution seems to work, the first tests are encouraging!

That solution may develop problems as you refine the spatial grid. If your current grid is fine enough, then you will be fine.