Generic code, iterative methods and automatic differentiation

In Julia it is quite easy to write generic code that can be used with many numbers type (eg BigFloat) or many automatic differentiation algorithms. However if this code contains some iterative parts eg such as:
find t such that g(x(t,p),p) = 0 where x is some N-dimensional trajectory possibly numerically integrated, t is the independent variable of the trajectory (eg time) and p is a parameter vector, it seems a pity to use the differentiation scheme in all the iterations since the derivative can be obtained from the solution only:

\frac{dt}{dp} = \frac { - \frac{\delta g}{\delta x} \frac{dx}{dp} - \frac{\delta g}{\delta p} } { \frac{\delta g}{\delta x} \frac{dx}{dt} + \frac{\delta g}{\delta t} }

eg find the solution without derivatives. Then with the solution and using an autodiff scheme, obtain all the derivatives

\frac{\delta g}{\delta x}, \frac{dx}{dp}, \frac{\delta g}{\delta p}, \frac{dx}{dt}, \frac{\delta g}{\delta t}

and then compute

\frac{dt}{dp}

But then this part of the code is not generic anymore. I am wondering how people deal with these kind of problems in large modelling applications.

Thank you.

This is very close to the difference between continuous sensitivity analysis methods ( http://docs.juliadiffeq.org/latest/analysis/sensitivity.html ) and discrete sensitivity analysis (AD). Continuous sensitivity analysis just builds another ODE while discrete sensitivity analysis propagates derivatives through the code (AD). We will be putting a paper out soon that shows that, unless the ODE is derived at compile time, discrete sensitivity analysis is much much faster.

That said, this means that we will want to continue to optimize discrete sensitivity analysis, and this piece right here is an optimization that can be done in the implicit solvers. It can be done almost generically. I mentioned in the Slack the other day that technically it cannot be done generically, but only because ForwardDiff.value is a different function from ReverseDiff.value which is a different function from Measurements.value etc. If a generic value function was in Julia Base, this could be written down. For now, itā€™s a smallish optimization so we arenā€™t worried about it, but it is something we will add to all of DifferentialEquations.jl when the time comes.

Itā€™s still generic, itā€™s just optimized. Even if you add specific handling for ForwardDiff.Dual itā€™s still generic, just better at handling standard cases. That is something to keep in mind with Julia code: multiple dispatch is about building the generic code and optimizing based on datatypes when you want to, not when you have to.

More generally, this idea of being smart with generic handling and separating the true continuous solution from the ā€œcontrolā€ parts of code, both for AD and other applications like Measurements, is something that we are investigating quite deeply. It brings a whole new element to optimizing generic programming.

2 Likes

Thank you for your reply.

There is a typo in your doc. The left hand side of the ODE is using state variable u while the right hand side uses y.

Interesting. We are using both techniques at my workplace and AD is much slower but I cannot really compare specifics because the 2 pieces of code (one in Fortran, the other in C++) are different in so many other ways.
In your comparison are you using AD to obtain the coefficients of the linear sensitivity ODE (if so I guess the same AD technique against which you are benchmarking?)
In which journal will you be publishing?

Yes, we are using ForwardDiff to generate the Jacobian if the user doesnā€™t supply one. You can take a look at our code here:

https://github.com/JuliaDiffEq/DiffEqSensitivity.jl/blob/master/src/local_sensitivity.jl#L36-L66

and it ties into DiffEqDiffTools. Weā€™re going to take one last pass through checking for optimizations before really committing to the conclusion, but so far havenā€™t found anything.

Hopefully it gets done by then!