Handling Instability When Solving ODE Problems

Thanks for the example. I'll share as little as possible, just a snippet, in order to explain to others what's going on, since this is a good teaching opportunity.

I was able to find out that the problem is that, in the statement

k9*X3/(k10*k11/(k12*(1.-min(1.,k13+k14*X5))) + X3)

when X3 and X5 are dual numbers, the value of k11/(k12*(1.-min(1.,k13+k14*X5))) gets NaN derivative components whenever the min clamps to the constant 1.0: the denominator becomes zero, and since the derivative of the min on that branch is 0, the derivative works out to 0/0 = NaN. That NaN propagates through to make everything NaN and throws the solver into instability. :frowning: @jrevels, that sounds like something ForwardDiff.jl can't handle on its own?
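To make the mechanism concrete, here is a minimal hand-rolled dual-number sketch (my own toy type, not ForwardDiff's actual internals) that reproduces the Inf value and NaN derivative:

```julia
# Toy dual number: val carries the value, der carries the derivative.
struct Dual
    val::Float64
    der::Float64
end
Base.:-(x::Float64, y::Dual) = Dual(x - y.val, -y.der)
Base.:/(x::Float64, y::Dual) = Dual(x / y.val, -x * y.der / y.val^2)
# When the min clamps, it returns the constant with zero derivative:
Base.min(x::Float64, y::Dual) = x <= y.val ? Dual(x, 0.0) : y

x5 = Dual(2.0, 1.0)          # an input where the min clamps; derivative seeded to 1
denom = 1.0 - min(1.0, x5)   # Dual(0.0, -0.0): value zero AND derivative zero
y = 1.0 / denom              # value 1/0 = Inf, derivative 0/0 = NaN
println((y.val, y.der))      # prints (Inf, NaN)
```

The value component merely overflows to Inf, but the derivative component hits 0/0, which is the NaN that then poisons the Jacobian.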

Fortunately there are four things you can do here.

Option 1: Finite Differencing

The first is naive: just don't use automatic differentiation. You can turn off autodiff in the native solvers, for example:

sol = solve(prob,Rosenbrock23(autodiff=false))
sol = solve(prob,Rodas4(autodiff=false))

If you do this, Rosenbrock23 solves it in 299 steps and Rodas4 in 134. CVODE_BDF solves it in 2399 steps; the reason it worked by default is that Sundials uses its own internal finite-differencing approximation to the Jacobian, so it's effectively autodiff=false by default (but takes 10x more steps!). Here's Rodas4's solution with this setting:

so that seems to be what you’re looking for.

Option 2: Define A Jacobian

The second thing that you can do is define a function for the Jacobian to avoid differencing. Either use this:

http://docs.juliadiffeq.org/latest/features/performance_overloads.html#Declaring-Explicit-Jacobians-1

or use @ode_def and it will calculate the Jacobian symbolically for you. Then no autodiff or finite differencing is necessary because the solver can evaluate the Jacobian directly. The new reaction DSL should be able to do this soon too, which would apply here since this is a chemical reaction model.
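To illustrate what defining a Jacobian buys you (the toy two-species system and names below are mine, not the model from this thread), here is a hand-written Jacobian in the J[i,j] = ∂f_i/∂u_j convention, sanity-checked against a central finite-difference estimate:

```julia
# Toy linear kinetics: du1 = -k1*u1, du2 = k1*u1 - k2*u2
k1, k2 = 2.0, 0.5
f(u) = [-k1 * u[1], k1 * u[1] - k2 * u[2]]

# Hand-written Jacobian J[i,j] = d f_i / d u_j -- no differencing, no NaNs
jac(u) = [-k1  0.0;
           k1 -k2]

# Check against central finite differences, column by column
u = [1.0, 0.3]
h = 1e-6
e1, e2 = [1.0, 0.0], [0.0, 1.0]
fd = hcat((f(u .+ h .* e1) .- f(u .- h .* e1)) ./ (2h),
          (f(u .+ h .* e2) .- f(u .- h .* e2)) ./ (2h))
println(maximum(abs.(jac(u) .- fd)))   # essentially zero
```

Providing the exact Jacobian skips both the dual-number pass and the finite-difference loop, so the problematic expression is never differentiated numerically at all.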

Option 3: Smooth the Discontinuity

The third thing you could do is a little more involved but more correct. The real problem is that the expression min(1.,k13+k14*X5) has a discontinuous derivative. This not only causes an issue for automatic differentiation, it also breaks some of the solver assumptions: ODE integrators assume that the function f is differentiable up to the order of the method, and that is violated here. Adaptive time stepping lets the integrators handle such problems anyway, but only by taking extra steps around the points of discontinuity.

Fortunately there is a good way to solve this problem: replace non-smooth functions with smooth approximations. For example, instead of using

if x>5
  v1
else
  v2
end

you can use this tanh approximation:

using Plots
v1 = 1.0
v2 = 5.0
h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v1+v2)/2  # blends from v1 to v2 around x = 5
pts = linspace(1.0,8.0,100)
plot(pts,[h(x,1) for x in pts],lw=3)
plot!(pts,[h(x,10) for x in pts],lw=3)
plot!(pts,[h(x,100) for x in pts],lw=3)

Here, k controls the steepness of the transition, so the hyperbolic tangent is a great way to smoothly transition between two functions.
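As a quick sanity check on the tanh blend (same v1, v2, and switch point as above):

```julia
v1, v2 = 1.0, 5.0
# tanh(k*(x-5)) sweeps from -1 to +1 around x = 5, so h moves from v1 to v2
h(x, k) = (v2 - v1) * tanh(k * (x - 5)) / 2 + (v1 + v2) / 2
println(h(0.0, 10))    # far below the switch point: v1 = 1.0
println(h(10.0, 10))   # far above the switch point: v2 = 5.0
println(h(5.0, 10))    # exactly at the switch point: the midpoint 3.0
```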

A similar trick exists for min: instead, use the exponentially weighted average (x*exp(-k*x) + y*exp(-k*y)) / (exp(-k*x) + exp(-k*y)) as a smooth approximation to min(x,y). For our case:

f(x) = min(1.,1.0+x)
function g(x,k)
    ex = exp(-k)
    ey = exp(-k*(1.0+x))
    (ex + (1.0+x)*ey)/(ex+ey)
end

using Plots
pts = linspace(-2.0,2.0,100)
plot(pts,[f(x) for x in pts],lw=3)
plot!(pts,[g(x,1) for x in pts],lw=3)
plot!(pts,[g(x,10) for x in pts],lw=3)
plot!(pts,[g(x,100) for x in pts],lw=3)

Notice here that k is the “steepness” once again, with higher k being closer to the actual function (but with larger exponents, so it can be less numerically stable). To balance the trade-off, I found that k=10 works well. Thus I replaced min(1.,k13+k14*X5) in your code with

smooth_min_1 = (x) -> g(x,10)
smooth_min_1(k13+k14*X5)
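For reference, the same exponentially weighted trick written as a generic two-argument smooth min (the function name is mine):

```julia
# Weighted softmin: the weights exp(-k*x) favor the smaller argument as k grows
function smooth_min(x, y, k)
    ex, ey = exp(-k * x), exp(-k * y)
    (x * ex + y * ey) / (ex + ey)
end

println(smooth_min(1.0, 3.0, 10))   # ≈ min(1.0, 3.0) = 1.0
```

In practice you would subtract the smaller argument from both exponents before exponentiating to avoid overflow when k times the arguments gets large.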

Now your ODE function is smooth (it has infinitely many derivatives). This makes Rosenbrock23 solve it in 28 steps, Rodas4 in 23, and CVODE_BDF in 35. That's a significant speedup: roughly 10x fewer steps for the integrators!

One thing to note is that this will increase the stiffness of the problem. If you're using an L-stable integrator like we are here, that's fine; if you're using a Runge-Kutta method, it can hurt performance.

Option 4: Using a Callback

The problem is the discontinuity, so we can use a ContinuousCallback to make sure the integrator steps exactly to the time points of discontinuity. It's just

condition(u,t,integrator) = k13+k14*u[5] - 1
affect!(integrator) = nothing
cb = ContinuousCallback(condition,affect!,save_positions=(false,false))

and then add this to the integration. This needs to use numerical differentiation to avoid the NaN (unless you put ForwardDiff into NaN-safe mode; see Jarrett's post below).

sol = solve(prob,Rosenbrock23(autodiff=false), callback=cb)
sol = solve(prob,Rodas4(autodiff=false), callback=cb)
sol = solve(prob,CVODE_BDF(), callback=cb)

This solves it in 304, 152, and 320 steps respectively. It hurts the Rosenbrock methods a little bit but probably makes them a little more accurate, and it helps the BDF method quite a bit to take the discontinuity into account!

Conclusion

The issue was the discontinuity in the derivative. You can turn off autodiff or define the Jacobian yourself to work around it, but the best solution is to smooth the discontinuity. This not only makes the derivative well defined, it also makes the solve much quicker.

If you found this helpful, let me know and we can add this to the FAQ:

http://docs.juliadiffeq.org/latest/basics/faq.html

If you could share the example for the FAQ, or help me construct an alternative, let me know. You didn't mention whether or not this example could be shared, so I've played it on the safe side for now.
