Integrators with analytic derivatives w/r/t initial conditions (especially the independent variable)

Hello,

I am working on an application that requires derivatives of ODE solutions to compute partials for solvers. The approach I am taking is to prefer analytic partials (for performance, which is key), and to fall back to AD or finite differences only when needed and at the lowest level. This means I need the partials of the solution of an ODE with respect to its initial conditions. For fixed-step integration this is relatively straightforward, but fixed step is not ideal for obvious reasons.

Is there any work in the extensive DifferentialEquations ecosystem on ODE integrators with analytic derivatives (especially with respect to the independent variable)? I could make do with fixed step, but adaptive step is better (and much, much harder).

You mean with respect to the dependent variable? The derivative with respect to the independent variable is just f (that's the definition of the ODE). The derivative with respect to u0 is included in the SciMLSensitivity codes.
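(For reference, the quantity in question is the sensitivity matrix $S(t) = \partial u(t)/\partial u_0$ of the ODE $\dot u = f(u, p, t)$, which itself satisfies the forward variational equations

$$
\dot S(t) = \frac{\partial f}{\partial u}\bigl(u(t), p, t\bigr)\, S(t), \qquad S(t_0) = I,
$$

so "analytic derivatives with respect to initial conditions" amounts to integrating this matrix ODE alongside the original one.)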

Note that AD actually uses the continuous adjoints by default. This is discussed in the SciMLSensitivity documentation. You can directly choose how the function is differentiated via the sensealg keyword:
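For example, a minimal sketch (the toy problem and the particular `InterpolatingAdjoint(autojacvec = ZygoteVJP())` choice here are just illustrative; it is one of several available sensealg options):

```julia
using OrdinaryDiffEq, SciMLSensitivity, Zygote

f(u, p, t) = p[1] .* u
prob = ODEProblem(f, [1.0], (0.0, 1.0), [-0.5])

# The sensealg keyword to solve selects how the solve is differentiated:
loss(p) = sum(solve(prob, Tsit5(); p = p, saveat = 0.1,
                    sensealg = InterpolatingAdjoint(autojacvec = ZygoteVJP())))
Zygote.gradient(loss, [-0.5])
```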

or by using the direct interface
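e.g. something along these lines with `adjoint_sensitivities` (again a sketch on the same toy problem; keyword names like `dgdu_discrete` are taken from the current SciMLSensitivity API, here with a discrete cost summed over the saved time points):

```julia
using OrdinaryDiffEq, SciMLSensitivity

f(u, p, t) = p[1] .* u
prob = ODEProblem(f, [1.0], (0.0, 1.0), [-0.5])
sol = solve(prob, Tsit5())

# Gradient of the summed solution values with respect to u0 and p,
# via the direct adjoint interface:
dg(out, u, p, t, i) = (out .= 1.0)
ts = 0.0:0.1:1.0
du0, dp = adjoint_sensitivities(sol, Tsit5(); t = ts, dgdu_discrete = dg,
                                sensealg = InterpolatingAdjoint())
```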

but we recommend just using the defaults in most scenarios.

Do you need to take the discretization into account, i.e. do you need the derivatives to nearly machine precision (including the derivative of the discretization error), or do you only need the derivative to the same accuracy as the ODE solution?

The latter ("differentiate-then-discretize") approach is much easier, is independent of the discretization scheme, and is better behaved for adaptive schemes; see ForwardDiff + Adaptive ODE Solvers: Timestep Issue Leads to Incorrect Derivatives - #4 by ChrisRackauckas
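As a quick sanity check of the distinction, a sketch on a toy linear ODE where the exact derivative is known (the tolerances here are arbitrary):

```julia
using OrdinaryDiffEq, ForwardDiff

# u' = a*u has exact solution u(T) = u0 * exp(a*T), so du(T)/du0 = exp(a*T).
a, T = -0.5, 1.0
g(u0) = solve(ODEProblem((u, p, t) -> a * u, u0, (0.0, T)), Tsit5();
              reltol = 1e-8, abstol = 1e-8, save_everystep = false).u[end]

# Agrees with the exact derivative to roughly the solver tolerance,
# not to machine precision:
ForwardDiff.derivative(g, 1.0) - exp(a * T)
```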

Another question is how many initial conditions you have (i.e. how big is your ODE system), and how many things do you need to differentiate?

  • If you have relatively few variables, and/or you want the full Jacobian matrix of the solution with respect to the initial conditions, you are probably better off with forward-mode differentiation (whether by hand or using AD), which is simpler and more efficient in this regime; see the sketch after this list.
  • If you have lots of variables (≳ 100), and you only need to differentiate one (or a few) quantities, such as a scalar “loss function” of the solution, you are probably better off with reverse-mode / adjoint methods (whether by hand or using AD).
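For the forward-mode case, a minimal sketch of getting the full Jacobian with respect to u0 by pushing ForwardDiff through the solver (the harmonic-oscillator problem and tolerances are just placeholders):

```julia
using OrdinaryDiffEq, ForwardDiff

f(u, p, t) = [u[2], -u[1]]   # toy problem: simple harmonic oscillator
function final_state(u0)
    prob = ODEProblem(f, u0, (0.0, 1.0))
    solve(prob, Tsit5(); reltol = 1e-8, abstol = 1e-8,
          save_everystep = false).u[end]
end

# Full 2×2 Jacobian ∂u(T)/∂u0 via forward-mode AD through the adaptive solver:
J = ForwardDiff.jacobian(final_state, [1.0, 0.0])
```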

For a gentler introduction to these things, see e.g. chapter 9 of our Matrix Calculus course notes. For more in-depth coverage, see Chris’s review article.

I also wrote a little Julia tutorial implementing forward- and reverse-mode differentiate-then-discretize by hand, (mostly) following the course notes, to differentiate a small ODE solution with respect to its initial conditions. It's not necessarily any more efficient than using SciMLSensitivity.jl, however, which also uses AD within the analytical differentiate-then-discretize approach.