Thanks for the example. I'll share as little as possible, just a snippet, in order to explain what's going on to others. This is a good teaching opportunity.
I was able to find out that the problem is that, in the statement `k9*X3/(k10*k11/(k12*(1.-min(1.,k13+k14*X5))) + X3)`, when `X3` and `X5` are dual numbers, the value of `k11/(k12*(1.-min(1.,k13+k14*X5)))` is `NaN` in the derivative components whenever the condition makes the `min` the constant `1.0`: the denominator `1.-min(...)` becomes exactly zero while the derivative of the `min` is also zero, so the quotient rule produces `0/0 = NaN` in the derivative components. That `NaN` propagates through to make everything `NaN` and throws the integration into instability. @jrevels, is that something ForwardDiff.jl could handle?
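To make the mechanism concrete, here is a minimal sketch of forward-mode dual arithmetic. This is not ForwardDiff.jl's actual implementation, and the `denom` function and its constants are stand-ins I made up, but it reproduces the same `0/0` in the derivative component:

```julia
# Minimal dual-number sketch (NOT ForwardDiff.jl's real internals)
struct MyDual
    val::Float64  # primal value
    der::Float64  # derivative component
end
Base.:-(a::Float64, b::MyDual) = MyDual(a - b.val, -b.der)
Base.:/(a::Float64, b::MyDual) = MyDual(a / b.val, -a * b.der / b.val^2)
Base.min(a::Float64, b::MyDual) = a < b.val ? MyDual(a, 0.0) : b

# Stand-in for k11/(k12*(1. - min(1., k13 + k14*X5))) with the min saturated
denom(x) = 2.0 / (1.0 - min(1.0, x))

d = denom(MyDual(1.5, 1.0))
# d.val is Inf (division by an exact zero), and d.der is -2*0/0^2 = NaN,
# which then poisons every downstream derivative.
```

Once one derivative component is `NaN`, every arithmetic operation that touches it stays `NaN`, which is exactly the instability seen in the solve.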
Fortunately, there are a few things you can do here.
Option 1: Finite Differencing
The first one is naive: just don't use autodifferentiation. You can turn off autodifferentiation in the native solvers, for example:

```julia
sol = solve(prob,Rosenbrock23(autodiff=false))
sol = solve(prob,Rodas4(autodiff=false))
```
If you do this, then `Rosenbrock23` solves it in 299 steps and `Rodas4` in 134. `CVODE_BDF` solves it in 2399 steps, and the reason it worked by default is that Sundials uses its own internal finite differencing approximation to the Jacobian, so it's basically `autodiff=false` by default (but takes 10x more steps!). Here's `Rodas4`'s solution like this:

[plot of the `Rodas4` solution]

so that seems to be what you're looking for.
Option 2: Define A Jacobian
The second thing you can do is define a function for the Jacobian to avoid the differencing entirely. Either use this:

http://docs.juliadiffeq.org/latest/features/performance_overloads.html#Declaring-Explicit-Jacobians-1

or use `@ode_def` and it'll automatically calculate the Jacobian for you. Then no autodiff or finite differencing is necessary because it can directly compute the Jacobian. The new reaction DSL should be able to do this soon too, which would apply here since this is a chemical reaction model.
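As a sketch of what an explicit Jacobian looks like, here is a toy two-species system of my own invention (not your model). The idea is simply that you hand-write `J[i,j] = ∂f_i/∂u_j` so the solver never has to difference or autodiff through your right-hand side:

```julia
# Toy in-place RHS for A -> B -> 0 with rates k1, k2 (illustrative only)
function f!(du, u, p, t)
    k1, k2 = p
    du[1] = -k1*u[1]
    du[2] =  k1*u[1] - k2*u[2]
end

# Hand-written Jacobian J[i,j] = d f_i / d u_j
function jac!(J, u, p, t)
    k1, k2 = p
    J[1,1] = -k1;  J[1,2] = 0.0
    J[2,1] =  k1;  J[2,2] = -k2
end
```

In DifferentialEquations.jl you would then attach `jac!` alongside `f!` (e.g. via the performance-overloads interface linked above) when constructing the problem.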
Option 3: Smooth the Discontinuity
The last thing you could do is a little more complicated, but more correct. The real problem is that the term `min(1.,k13+k14*X5)` has a discontinuous derivative. This not only causes an issue for autodifferentiation, it also breaks some of the solver assumptions: ODE integrators assume that the `f` function is differentiable to the degree that is the order of the method, and that is violated here. Adaptive time stepping lets them still handle such problems, but only by taking extra steps around the points of discontinuity.
Fortunately there is a good way to solve this problem: replace non-smooth functions with a smooth approximation. One common example: instead of using

```julia
if x < 5
  v1
else
  v2
end
```

use this approximation:

```julia
using Plots
v1 = 1.0
v2 = 5.0
h(x,k) = (v2-v1)*tanh(k*(x-5))/2 + (v1+v2)/2
pts = linspace(1.0,8.0,100)
plot(pts,[h(x,1) for x in pts],lw=3)
plot!(pts,[h(x,10) for x in pts],lw=3)
plot!(pts,[h(x,100) for x in pts],lw=3)
```
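As a quick numeric sanity check (my addition), away from the transition point `h` reproduces the two branch values almost exactly, because `tanh` saturates to ±1:

```julia
v1 = 1.0
v2 = 5.0
# Smoothed step: tanh saturates to -1 and +1 on either side of x=5,
# so h flattens to v1 on one side and v2 on the other; the constant
# (v1+v2)/2 centers the step between the two branch values.
h(x, k) = (v2 - v1)*tanh(k*(x - 5))/2 + (v1 + v2)/2

h(3.0, 100)  # ≈ 1.0, one branch value
h(7.0, 100)  # ≈ 5.0, the other branch value
```

At steepness `k=100` and two units away from the transition, the deviation from the exact branch values is far below floating-point noise.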
Here, `k` is the steepness of the transition. So hyperbolic tangent is a great way to smoothly transition between two functions.
A similar trick exists for `min`: use `(x*exp(-k*x) + y*exp(-k*y)) / (exp(-k*x) + exp(-k*y))` as a smooth approximation to `min(x,y)`.
```julia
f(x) = min(1.,1.0+x)
# g is the smoothed min(1, 1+x): an exponentially weighted average
# with weights exp(-k*1) and exp(-k*(1+x))
function g(x,k)
  ex = exp(-k)
  ey = exp(-k*(1.0+x))
  (ex + (1.0+x)*ey)/(ex+ey)
end

using Plots
pts = linspace(-2.0,2.0,100)
plot(pts,[f(x) for x in pts],lw=3)
plot!(pts,[g(x,1) for x in pts],lw=3)
plot!(pts,[g(x,10) for x in pts],lw=3)
plot!(pts,[g(x,100) for x in pts],lw=3)
```
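To quantify how the approximation tightens as `k` grows, here is a quick max-error check of `g` against the true `f` (my addition; the grid and the error metric are arbitrary choices):

```julia
f(x) = min(1., 1.0 + x)
# Exponentially weighted smooth min of 1 and 1+x, as above
function g(x, k)
    ex = exp(-k)
    ey = exp(-k*(1.0 + x))
    (ex + (1.0 + x)*ey)/(ex + ey)
end

# Worst-case deviation over a grid straddling the kink at x = 0
maxerr(k) = maximum(abs(g(x, k) - f(x)) for x in -2.0:0.01:2.0)
```

The worst-case error shrinks roughly like `1/k`, so `k=10` is already quite tight while keeping the exponents small enough to stay numerically stable.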
Notice here that `k` is the “steepness” once again, with higher `k` being closer to the actual function (but with larger exponents, so it can be less numerically stable). To balance the trade-off, I found it's good to use `k=10`. Thus I replaced `min(1.,k13+k14*X5)` in your code with
```julia
smooth_min_1 = (x) -> g(x,10)
smooth_min_1(k13+k14*X5)
```
Now your derivative function is smooth (it has infinitely many derivatives). This makes `Rosenbrock23` solve it in 28 steps, `Rodas4` in 23, and `CVODE_BDF` in 35. So this gives a significant speedup: roughly 10x fewer steps for the integrators!
One thing to note, though, is that this will increase the stiffness of the problem. If you're using an L-stable integrator like we are here, that's fine. If you're using a Runge-Kutta method, it can hurt performance.
Option 4: Using a Callback
The problem is the discontinuity, so we can just use a `ContinuousCallback` to make sure we hit the time points of discontinuity. It's just

```julia
condition(u,t,integrator) = k13+k14*u[5] - 1
affect!(integrator) = nothing
cb = ContinuousCallback(condition,affect!,save_positions=(false,false))
```

and then add this to the integration. This still needs to use numerical differentiation to avoid the `NaN` (unless you put ForwardDiff in NaN-safe mode, see Jarrett's post below):

```julia
sol = solve(prob,Rosenbrock23(autodiff=false), callback=cb)
sol = solve(prob,Rodas4(autodiff=false), callback=cb)
sol = solve(prob,CVODE_BDF(), callback=cb)
```
This solves it in 304, 152, and 320 steps respectively. It hurts a little on the Rosenbrock methods, but probably makes them a little more accurate. It helps the BDF method quite a bit to take the discontinuity into account!
Conclusion
The issue was the discontinuity in the derivative. You can turn off autodiff to avoid it or define the Jacobian yourself, but the best solution is to smooth the discontinuous term. This will not only make the derivative better defined, it will also make the solve much quicker.
If you found this helpful, let me know and we can add it to the FAQ:

http://docs.juliadiffeq.org/latest/basics/faq.html

If you could share the example for the FAQ, or help me find an alternative, let me know. You didn't mention whether or not this example could be shared, so I've played it safe for now.