Thanks for the example. I’ll share as little as possible, just a snippet, in order to explain to others what’s going on. This is a good teaching opportunity.
I was able to find out that the problem is in the statement
k9*X3/(k10*k11/(k12*(1.-min(1.,k13+k14*X5))) + X3)
When X5 is a dual number, min(1.,k13+k14*X5) produces NaN derivative components whenever the condition makes it the constant 1.0: the derivative of the min is 0 there, so 1.-min(...) is zero with zero derivative, and the division gives Inf for the value and NaN for the derivative components. That NaN propagates until everything is NaN and throws the integration into instability. @jrevels, that doesn’t sound like something ForwardDiff.jl could handle, does it?
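For anyone curious, the failure mode is easy to reproduce in isolation (a hypothetical stand-in for the expression above, not the original model):

```julia
using ForwardDiff

# When min picks the constant branch, the dual's derivative component is 0.
# 1 - min(...) is then zero with zero derivative, and the division produces
# Inf for the value and NaN for the derivative component.
g(x) = 1.0 / (1.0 - min(1.0, x))

d = ForwardDiff.derivative(g, 2.0)
isnan(d)  # true
```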
Fortunately, there are four things you can do here.
Option 1: Finite Differencing
The first is naive: just don’t use autodifferentiation. You can turn off autodiff in the native solvers, for example like:
sol = solve(prob,Rosenbrock23(autodiff=false))
sol = solve(prob,Rodas4(autodiff=false))
If you do this, Rosenbrock23 solves it in 299 steps and Rodas4 in 134. CVODE_BDF solves it in 2399 steps; the reason it worked by default is that Sundials uses its own internal finite-differencing approximation to the Jacobian, so it’s effectively autodiff=false by default (but takes roughly 10x more steps!). Here’s Rodas4’s solution with this setting:
so that seems to be what you’re looking for.
Option 2: Define A Jacobian
The second thing you can do is define a function for the Jacobian so that neither autodiff nor finite differencing is needed: the solver can evaluate the Jacobian directly. Alternatively, define your system with @ode_def and it will calculate the Jacobian symbolically for you. The new reaction DSL should be able to do this soon too, which would apply here since this is a chemical reaction model.
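As a sketch of the manual-Jacobian route (a made-up two-species toy system, not your model; the jac keyword of ODEFunction is the piece that matters):

```julia
using OrdinaryDiffEq

# Toy linear system: X1 decays into X2, X2 decays away.
f(du, u, p, t) = (du[1] = -p[1]*u[1]; du[2] = p[1]*u[1] - p[2]*u[2])

# Hand-written Jacobian df/du, so no autodiff or finite differencing is needed.
function jac(J, u, p, t)
    J[1,1] = -p[1]; J[1,2] = 0.0
    J[2,1] =  p[1]; J[2,2] = -p[2]
end

ff   = ODEFunction(f, jac = jac)
prob = ODEProblem(ff, [1.0, 0.0], (0.0, 10.0), [1.5, 0.5])
sol  = solve(prob, Rodas4())
```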
Option 3: Smooth The Discontinuity
The third thing you could do is a little more complicated but more correct. The real problem is that the term
min(1.,k13+k14*X5) has a discontinuous derivative. This not only causes an issue for autodifferentiation, but it also breaks some of the solver assumptions. ODE integrators assume that the
f function is differentiable to the degree that is the order of the method, and that is violated here. Adaptive time stepping is able to make them still handle these problems, but that requires taking extra steps around the points of discontinuity.
Fortunately, there is a good way to solve this problem: replace the non-smooth function with a smooth approximation. One common example: instead of a hard switch between two values v1 and v2, use a tanh transition:
using Plots
v1 = 1.0
v2 = 5.0
h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v1+v2)/2
pts = linspace(1.0,8.0,100)
plot(pts,[h(x,1) for x in pts],lw=3)
plot!(pts,[h(x,10) for x in pts],lw=3)
plot!(pts,[h(x,100) for x in pts],lw=3)
Here k is the steepness of the transition, so the hyperbolic tangent is a great way to smoothly transition between two functions.
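A quick sanity check on that transition (using the midpoint offset (v1+v2)/2, which equals 3 for these particular values): far from the transition point x=5, a large k should recover the two plateau values.

```julia
v1 = 1.0
v2 = 5.0
h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v1+v2)/2

# well below / above x = 5, a steep transition recovers v1 and v2
abs(h(2.0, 100) - v1) < 1e-8  # true
abs(h(8.0, 100) - v2) < 1e-8  # true
```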
A similar trick exists for min: the exponentially weighted softmin
g(x,y) = (x*exp(-k*x) + y*exp(-k*y)) / (exp(-k*x) + exp(-k*y))
Specialized to approximate f(x) = min(1.,1.0+x), this becomes
f(x) = min(1.,1.0+x)
function g(x,k)
  ex = exp(-k)
  ey = exp(-k*(1.0+x))
  (ex + (1.0+x)*ey)/(ex+ey)
end
pts = linspace(-2.0,2.0,100)
plot(pts,[f(x) for x in pts],lw=3)
plot!(pts,[g(x,1) for x in pts],lw=3)
plot!(pts,[g(x,10) for x in pts],lw=3)
plot!(pts,[g(x,100) for x in pts],lw=3)
Notice here that
k is the “steepness” once again, with higher
k being closer to the actual function (but with larger exponents, so it can be less numerically stable). To balance the trade-off, I found it’s good to use k=10. Thus I replaced min(1.,k13+k14*X5) in your code with smooth_min_1(k13+k14*X5-1.0), where
smooth_min_1 = (x) -> g(x,10)
(recall g(x,k) approximates min(1.,1.0+x), so the argument is shifted by 1).
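As a numerical check that the softmin tracks min(1.,1.0+x) away from the kink (same f and g as above, with k=10):

```julia
f(x) = min(1.0, 1.0 + x)
function g(x, k)
    ex = exp(-k)             # weight for the value 1.0
    ey = exp(-k*(1.0 + x))   # weight for the value 1.0 + x
    (ex + (1.0 + x)*ey) / (ex + ey)
end

abs(g(-1.0, 10) - f(-1.0)) < 1e-3  # true: deep in the 1.0+x branch
abs(g( 1.0, 10) - f( 1.0)) < 1e-3  # true: deep in the constant branch
```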
Now your derivative function is smooth (has infinitely many derivatives). This makes
Rosenbrock23 solve it in 28 steps,
Rodas4 in 23, and
CVODE_BDF in 35. So this gives a significant speedup: about 10x fewer steps for the integrators!
One thing to note, though, is that this will increase the stiffness of the problem. If you’re using an L-stable integrator like we are here, that’s fine; if you’re using a Runge-Kutta method, it can hurt performance.
Option 4: Using a Callback
The problem is the discontinuity, so we can just use a
ContinuousCallback to make sure we hit the time points of discontinuity. It’s just
condition(u,t,integrator) = k13+k14*u - 1
affect!(integrator) = nothing
cb = ContinuousCallback(condition,affect!,save_positions=(false,false))
and then add this to the integration. This needs to use numerical differentiation to avoid the NaN (unless you put ForwardDiff in NaN-safe mode, see Jarrett’s post below).
sol = solve(prob,Rosenbrock23(autodiff=false), callback=cb)
sol = solve(prob,Rodas4(autodiff=false), callback=cb)
sol = solve(prob,CVODE_BDF(), callback=cb)
This solves it in 304, 152, and 320 steps respectively. It hurts a little on the Rosenbrock methods but probably makes them slightly more accurate. It helps the BDF method quite a bit to take the discontinuity into account!
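For completeness, here’s the same callback pattern on a self-contained toy problem (a hypothetical ODE, not the original model): the kink sits at u = 1, and the do-nothing affect! just forces the stepper to land exactly on it.

```julia
using OrdinaryDiffEq

# du/dt = 2 - min(1, u): f has a kink when u crosses 1
f(u, p, t) = 2.0 - min(1.0, u)

condition(u, t, integrator) = u - 1.0  # event fires at the discontinuity
affect!(integrator) = nothing          # change nothing, just step there
cb = ContinuousCallback(condition, affect!, save_positions = (false, false))

prob = ODEProblem(f, 0.0, (0.0, 2.0))
sol  = solve(prob, Rosenbrock23(autodiff = false), callback = cb)
```

The exact solution here is u(t) = 2(1 - exp(-t)) until the crossing at t = log(2), then linear growth, so u(2) = 3 - log(2) ≈ 2.31.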
The issue was the discontinuity in the derivative. You can avoid it by turning off autodiff or by defining the Jacobian yourself, but the best solution is to smooth the discontinuity. This not only makes the derivative well-defined, it also makes the solve much quicker.
If you found this helpful, let me know and we can add it to the FAQ. If you could share the example for the FAQ, or help me find an alternative, let me know. You didn’t mention whether or not this example could be shared, so I’ve played it safe for now.