I have a very silly question. I am trying to understand how reverse-mode autodiff is able to actually work on coupled non-linear ODE systems that are very hard to solve backwards in time.
Take the Lorenz system (I am new to Julia so this is all Python code, but should be very simple to understand/adapt to Julia):
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
def lorenz(t, state, args):
x, y, z = state
rho, sigma, beta = args
return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]
# set up initial conditions, timespan for integration, and fiducial parameter values
y0 = np.array([5., 5., 5.])
tarr = np.linspace(0, 10., 1000)
params = [28., 10., 8/3.] # rho, sigma, beta
# solve the system forward in time
ys = solve_ivp(lorenz,[tarr[0],tarr[-1]],y0,dense_output=True,method='RK23',
atol=1e-12,rtol=1e-12,args=(params,),t_eval=tarr)
# now solve the system backward in time starting from the final state above and reversing time array
ys2 = solve_ivp(lorenz,[tarr[-1],tarr[0]],ys.y.T[-1],dense_output=True,method='BDF',
atol=1e-12,rtol=1e-12,args=(params,),t_eval=tarr[::-1])
# overplot the forward and backward solutions
fig, ax = plt.subplots(1,figsize=(6,4),dpi=150,subplot_kw={'projection':'3d'})
ax.plot(ys.y[0],ys.y[1],ys.y[2],'b-',lw=0.5)
end = -1 # or change this to -30 to disregard the last 30 chaotic backward plots
ax.plot(ys2.y[0][:end],ys2.y[1][:end],ys2.y[2][:end],'r-',lw=0.5,alpha=0.8) backward steps
With end=-1
, the plot looks like this (blue is the forward solution, and red is the backward solution):
You can see that the backward solution has gone horribly wrong. In fact the above solve_ivp call doesn’t even fully finish all the way to t=0 – it stops at t~7.42 with the error/warning “Required step size is less than spacing between numbers.” If we disregard the last 30 steps of the backward solution (end=-30
), we see that the solution was doing “decent” until it started spiraling out of control which probably then somehow led to the step size error:
This seems like chaotic dynamical system behavior. Notice that I am using relatively tight tolerances (atol=rtol=1e-12) and the BDF implicit/stiff solver for the backwards-in-time solve (none of the other solvers work at all for the backward solve – solve_ivp just seems to hang).
Now my questions:
- Would any of the Julia solvers be able to solve the original system backwards in time? Am I doing anything wrong above? Would the backward solve become easier for different parameters (rho,sigma,beta) and/or ICs (x,y,z)?
- Reverse-mode autodiff involves solving the sensitivity/adjoint ODEs alongside the original ODE system backwards in time. Given that I can’t even solve the original Lorenz system by itself backwards in time, how is reverse-mode autodiff able to do the backward ODE solve for the full augmented/adjoint system and propagate perturbations in the final state to corresponding linear perturbations in the initial input parameters?
- How can we trust the gradients found by reverse-mode autodiff for these kinds of systems that are difficult to solve backwards in time? (For simplicity, restrict ourselves to just the partials of the final state variable values wrt free parameters and ICs rather than the full time series of the sensitivities.)