Handling Instability When Solving ODE Problems

I have a model which have previously been constructed in matlab, now I want to try that model in Julia.
(The model is a system of differential equations in 5 variables which oscillates)
In matlab it is solved successfully using ode23s. The solution can be plotted and looks like:


(This is actually a sum of 2 variables, ploted for 5 different parameter values)

However when I try to solve it in Julia I get a WARNING: Instability detected. Aborting error. I have tried several different solvers (GenericImplicitEuler, GenericTrapezoid, Rosenbrock32, Rosenbrock23, Rodas4andRodas3`) and they all gives this problem. This is how it looks:


(Here u1+u3 corresponds to what is ploted in the matlab plot. Time 0 corresponds to 200 in the matlab plot)

I know little of how Julia handles instability, but while there is oscillation going on it do not look like the solution is on the verge of doom (maybe in a while?). I read that the matlab solver, ode23s, should correspond to Rosenbrock23(), however while the first fixes the problem the second does not.

Do anyone have any idea of what is going on? Or maybe someone have a good resource one were I can get some useful information on this kinds of thing (in Julia or in General)?

4 Likes

First of all, did you make sure that the Julia and MATLAB derivative functions are exactly the same? If you supplied Jacobians, are those the same? Secondly, what happens when the error tolerance is lower, or if you do something like dtmax = 1.0? Thirdly, give radau() or CVODE_BDF() a whirl: if they fail too then you can call the issue “universal”, i.e. not due to the integrator implementation.

Do you have code that can be shared?

CVODE_BDF() worked! Thank you :slight_smile:

Can you share this example? It would be good to have. (If necessary we can keep it private. I would like to see what’s going on though)

Thanks for the example. I will share as little as possible but just a snippet in order to explain what’s going on to others. This is a good teaching opportunity.

I was able to find out that the problem is that, in the statement

k9*X3/(k10*k11/(k12*(1.-min(1.,k13+k14*X5))) + X3)

when X3 and X5 are dual numbers, the value of k11/(k12*(1.-min(1.,k13+k14*X5))) is NaN for the derivative components when the condition makes it the constant 1.0 (since the derivative of the minimum of in the denominator is 0). This causes a NaN that propagates through to make everything NaN and throws it to instability. :frowning:. @jrevels that sounds not something ForwardDiff.jl could seemingly handle?

Fortunately there’s three things that you can do there.

Option 1: Finite Differencing

The first one is naive: just don’t use autodifferentiation. You can turn off autodifferentiation in the native solvers for example like:

sol = solve(prob,Rosenbrock23(autodiff=false))
sol = solve(prob,Rodas4(autodiff=false))

If you do this then Rosenbrock23 solves it in 299 steps and Rodas4 in 134. CVODE_BDF solves it in 2399 steps, and the reason why it worked by default is that Sundials uses its own internal finite differencing approximation to the Jacobian so it’s basically autodiff=false by default (but takes 10x more steps!). Here’s Rodas4’s solution like this:

so that seems to be what you’re looking for.

Option 2: Define A Jacobian

The second thing that you can do is define a function for the Jacobian to avoid differencing. Either use this:

http://docs.juliadiffeq.org/latest/features/performance_overloads.html#Declaring-Explicit-Jacobians-1

or use @ode_def and it’ll automatically calculate that. Then no autodiff or finite differencing is necessary because it can directly compute the Jacobian. The new reaction DSL should be able to do this soon too which would apply here since this is a chemical reaction model.

Now the last thing you could do is a little more complicated but more correct. The real problem is that the equation min(1.,k13+k14*X5) has a discontinuous derivative. This not only causes an issue for autodifferentiation, but it also breaks some of the solver assumptions. ODE integrators assume that the f function is differentiable to the degree that is the order of the method, and that is violated here. Adaptive time stepping is able to make them still handle these problems, but that requires taking extra steps around the points of discontinuity.

Fortunately there is a good way to solve this problem: replace non-smooth functions with a smooth approximation. One common example is instead of using

if x>5
  v1
else
  v2
end

see this approximation:

v1 = 1.0
v2 = 5.0
h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v2-2v1)
pts = pts = linspace(1.0,8.0,100)
plot(pts,[h(x,1) for x in pts],lw=3)
plot!(pts,[h(x,10) for x in pts],lw=3)
plot!(pts,[h(x,100) for x in pts],lw=3)

Here, k is the steepness of the transition. So hyperbolic tangent is a great way to smooth transitions between two functions.

A similar trick exists for min. You can do something different. instead, do (exp(-k*x) + exp(-k*y)) / (exp(x) + exp(y)) as an approximation to min(x,y).

f(x) = min(1.,1.0+x)
function g(x,k)
    ex = exp(-k)
    ey = exp(-k*(1.0+x))
    (ex + (1.0+x)*ey)/(ex+ey)
end

using Plots
pts = linspace(-2.0,2.0,100)
plot(pts,[f(x) for x in pts],lw=3)
plot!(pts,[g(x,1) for x in pts],lw=3)
plot!(pts,[g(x,10) for x in pts],lw=3)
plot!(pts,[g(x,100) for x in pts],lw=3)

Notice here that k is the “steepness” once again, with higher k being closer to the actual function (but with larger exponents so it can be less numerically stable). To balance the trade-off, I found that it’s good to use k=10. Thus I replaced min(1.,k13+k14*X5)) in your code with

smooth_min_1 = (x) -> g(x,10)
smooth_min_1(k13+k14*X5)

Now your derivative function is smooth (has infinitely many derivatives). This makes Rosenbrock23 solve it in 28 steps, Rodas4 in 23, and CVODE_BDF in 35. So this gives a significant speedup, like 10x less steps for the integrators!

One thing to note though is that this will increase the stiffness of the problem. If you’re using an L-stable integrator like we are here, that’s fine. If you’re using a Runge-Kutta method that can hurt performance.

Option 4: Using a Callback

The problem is the discontinuity, so we can just use a ContinuousCallback to make sure we hit the time points of discontinuity. It’s just

condition(u,t,integrator) = k13+k14*u[5] - 1
affect!(integrator) = nothing
cb = ContinuousCallback(condition,affect!,save_positions=(false,false))

and then add this to the integration. This needs to use numerical differentiation to avoid the NaN (unless you put ForwardDiff in NaN-safe mode, see Jarrett’s post below).

sol = solve(prob,Rosenbrock23(autodiff=false), callback=cb)
sol = solve(prob,Rodas4(autodiff=false), callback=cb)
sol = solve(prob,CVODE_BDF(), callback=cb)

This solves it in 304 and 152, and 320 steps respectively. It hurts a little bit on the Rosenbrock methods but probably makes them a little more accurate. It helps the BDF method quite a bit to take into account the discontinuity!

Conclusion

The issue was the discontinuity in the derivative. You can use autodiff to avoid it or define the Jacobian yourself, but the best solution is to smooth your derivative. This will not only make it better defined, but it will also make it much quicker.

If you found this helpful, let me know and we can add this to the FAQ:

http://docs.juliadiffeq.org/latest/basics/faq.html

If you could share the example for the FAQ, or help me find an alternative, let me know. You didn’t mention whether or not this example could be shared so I’ve played it on the safe side for now.

31 Likes

You can always try ForwardDiff’s NaN-safe mode, though that won’t help if you’re running into “actual” discontinuities.

2 Likes

Ah yes, never knew about this. Cool! The chance of hitting exactly the discontinuity is essentially zero, so that works here. I can confirm that when you run it in NaN-safe mode it solves in 298 steps with Rosenbrock23 and Rodas4 in 134 steps. The issue is that 1/0.0.

1 Like

Thank you, that is a lot of help. I am trying to digest everything right now. It is a lot of information but it is also very educational.

Out of curiosity (if we want of continue with the education). Another way to handle discontinuities would be a callback, but you do not mention it here? Now making you approximation of the minimum function is a lot simpler then a callback, I prefer that solution.

If I e.g. wanted a parameter to behave like a step, would another option (as opposed to a callback) be to use a continuous approximation of a step?

It would be a very good thing to include your reply in the FAQ, that’f for sure. I am a bit reluctant to have the model copied straight over (since it is not actually mine). However the version that I gave you was pretty anonymized. If it is possible to remove some stuff without disturbing the (bad) behaviour that would be good and you could put it up, especially since it is given entirely out of context. If not I can try ask the original modeller next time we meet.

Oh yes, a ContinuousCallback that makes it stop every time k13+k14*X5 = 1 would work too. That seems like it would be more computationally-intensive though, but I didn’t test that. For reference, the callback I’d use is:

condition(u,t,integrator) = k13+k14*u[5] - 1
affect!(integrator) = nothing
cb = ContinuousCallback(condition,affect!,save_positions=(false,false))

I added that to the post.

Yes, you can do things like, instead of making it a parameter, make it a smooth function of time (or a smooth function in time between two parameters. Hyperbolic tangent would be good there).

Should this case throw an error like “ERROR: The derivative evaluated to NaN for u=[1,2,3]. See FAQ for techniques to handle discontinuous derivatives.”

@Datseris asked about this. Here’s what’s going on.

http://docs.juliadiffeq.org/latest/basics/common_solver_opts.html#Miscellaneous-1

unstable_check defaults to checking for NaNs (but can be overridden), and when that’s true it warns and exists here:

PRs to make this more clear are welcome.

Thank you! I have had exactly the same problem. This should be of great help!

1 Like

This was really useful, thanks.
In case anyone is still reading this, the sigmoid approximation for heaviside is wrong, it should be

h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v1 + v2) / 2

(it just happened to work for v1 = 1.0 and v2 = 5.0)

6 Likes