Is autodifferentiation possible in this situation?

There’s a whole extensive system for this, complete with each of the options you wanted and the options you didn’t know you wanted. The short story is: if you stick an ODE solve into ForwardDiff, ReverseDiff, Tracker, Zygote, ModelingToolkit, or AutoGrad.jl (or Nabla? Never tried that one) calls, it’ll work. Hell, even TensorFlow will differentiate the DifferentialEquations.jl ODE solvers.
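
For instance, here’s a minimal sketch of sticking an ODE solve into ForwardDiff (a toy linear decay problem of my own choosing, not from your post):

```julia
using OrdinaryDiffEq, ForwardDiff

# Toy problem: u' = -p[1]*u, u(0) = 1, with loss(p) = u(1).
function loss(p)
    u0 = convert(eltype(p), 1.0)  # promote the state so Duals flow through
    prob = ODEProblem((u, p, t) -> -p[1] * u, u0, (0.0, 1.0), p)
    sol = solve(prob, Tsit5())
    sol.u[end]
end

ForwardDiff.gradient(loss, [2.0])  # ≈ [-exp(-2)], since u(1) = exp(-p)
```

Swapping in Zygote.gradient on the same kind of loss works too, with the sensitivity machinery picking the adjoint, and the links below explain how to control which method gets used.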

If you want the longer story:

https://diffeq.sciml.ai/stable/analysis/sensitivity/
https://diffeqflux.sciml.ai/dev/ControllingAdjoints/

Specifically, you can then finish this off with Quadrature.jl to represent and compute the integral (the Quadrature.jl wrapper is differentiable), or see the section on continuous adjoints of functionals.
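
As a sketch of what that composition could look like (a toy functional I made up, the integral of u(t)^2 over the decay problem from above):

```julia
using OrdinaryDiffEq, Quadrature, ForwardDiff

# Toy functional G(p) = ∫₀¹ u(t; p)² dt, where u' = -p[1]*u, u(0) = 1.
function G(p)
    u0 = convert(eltype(p), 1.0)
    prob = ODEProblem((u, p, t) -> -p[1] * u, u0, (0.0, 1.0), p)
    sol = solve(prob, Tsit5(), abstol=1e-10, reltol=1e-10)  # dense output
    iprob = QuadratureProblem((t, _p) -> sol(t)^2, 0.0, 1.0, p)
    solve(iprob, QuadGKJL(); reltol=1e-8, abstol=1e-8).u
end

ForwardDiff.gradient(G, [2.0])  # matches d/dp of (1 - exp(-2p)) / (2p)
```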

If you’re looking for the Fisher information matrix, then you’ll probably want the second order adjoint stuff, and forward-over-reverse is generally recommended there: https://diffeq.sciml.ai/stable/analysis/sensitivity/#Second-Order-Sensitivity-Analysis-via-second_order_sensitivities-(Experimental). Also, I would just try ForwardDiff.hessian on everything if you have a small number of parameters (<100?).
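
The brute-force version of that last suggestion looks like this (same kind of toy loss as above; a sketch, not tuned advice):

```julia
using OrdinaryDiffEq, ForwardDiff

# Toy sum-of-squares loss over the solution of u' = -p[1]*u, u(0) = 1.
function loss(p)
    u0 = convert(eltype(p), 1.0)
    prob = ODEProblem((u, p, t) -> -p[1] * u, u0, (0.0, 1.0), p)
    sol = solve(prob, Tsit5(), saveat=0.1)
    sum(abs2, sol.u)
end

ForwardDiff.hessian(loss, [2.0])  # 1×1 here; fine for small parameter counts
```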

You can in theory get around this constraint by using Enzyme.

You could then Clang-compile the C code and we could autodiff that. I haven’t set up a default for that, but I plan to next week. If you look at the Enzyme paper, I made sure some of the examples were PDE semidiscretizations for exactly this reason. I can’t do it until next week though, unless you can ever convince @vchuravy to work on DiffEq stuff :wink: . You’ll be way out in the weeds though, so I’d highly recommend doing it with pure Julia code (or R code).

There’s a funny thing that happens here, at least in a lot of astro cases. For long time integrations you may want to use a symplectic integrator. They’re used a lot throughout our classical physics model examples:

https://tutorials.sciml.ai/html/models/01-classical_physics.html

To get a clearer picture of what they do, you can check out the Kepler problem example:

https://tutorials.sciml.ai/html/models/05-kepler_problem.html
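
Usage-wise, a symplectic solve looks like this (a minimal harmonic oscillator sketch of my own, not one of the tutorial models):

```julia
using OrdinaryDiffEq

# Harmonic oscillator q'' = -q, split into velocity and position parts.
accel(dv, v, q, p, t) = (dv .= -q)  # dv/dt = -q
vel(dq, v, q, p, t)   = (dq .= v)   # dq/dt = v

prob = DynamicalODEProblem(accel, vel, [0.0], [1.0], (0.0, 100.0))
sol = solve(prob, VelocityVerlet(), dt=0.1)  # symplectic methods need a fixed dt
```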

Now what does that have to do with autodiff? The reason optimize-then-discretize and discretize-then-optimize are different is that running the ODE method itself backwards (as autodiff does) is different from running it forwards. Running through the steps of Explicit Euler backwards is Implicit Euler, since the point at which f is evaluated is the current time going forwards but the future time going backwards. But there are methods which are self-adjoint: the Trapezoidal method, for example, averages f at the current and future points, which is the same going backwards. So methods with this self-adjointness property give the same derivative in both optimize-then-discretize and discretize-then-optimize, provided the steps are matched.
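
To make that concrete (my notation, not anything from the links above): a forward step of Explicit Euler is

$$u_{n+1} = u_n + h \, f(u_n),$$

and solving that same relation for $u_n$ gives $u_n = u_{n+1} + (-h) \, f(u_n)$, i.e. an Implicit Euler step of size $-h$. The Trapezoidal method,

$$u_{n+1} = u_n + \frac{h}{2}\left(f(u_n) + f(u_{n+1})\right),$$

maps to itself under swapping $u_n \leftrightarrow u_{n+1}$ and $h \to -h$, which is exactly the self-adjointness property.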

It turns out that all symplectic integrators are (a) fixed time step (except for some really weird cases which are fun to think about but haven’t really found a use yet) and (b) self-adjoint, which means astrophysics and other symplectic problems are precisely the case where the two approaches coincide.

So I don’t know if this story comes into play here, but it’s a fun one to share haha.
