In Julia it is quite easy to write generic code that works with many number types (e.g. BigFloat) or many automatic differentiation algorithms. However, if this code contains an iterative part, e.g.:

find t such that g(x(t,p), p) = 0

where x is some N-dimensional trajectory (possibly numerically integrated), t is the independent variable of the trajectory (e.g. time), and p is a parameter vector, it seems a pity to run the differentiation scheme through all the iterations, since the derivative can be obtained from the solution alone.
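The "derivative from the solution alone" step is the implicit function theorem: with G(t, p) = g(x(t, p), p), the root t*(p) satisfies dt*/dp = -(∂G/∂p)/(∂G/∂t) at the solution, with no need to differentiate through the iterations. A minimal sketch with a toy example of my own (not from the thread), using plain finite differences so it needs no packages:

```julia
# Toy trajectory and condition: x(t, p) = p * t, g(x, p) = x - 1,
# so t*(p) = 1/p and dt*/dp = -1/p^2 analytically.
x(t, p) = p * t
g(xv, p) = xv - 1
G(t, p) = g(x(t, p), p)

# Newton iteration in plain Float64 -- no derivative types flow through here.
function solve_t(p; t0 = 1.0, tol = 1e-12)
    t = t0
    for _ in 1:50
        h = 1e-6
        dGdt = (G(t + h, p) - G(t - h, p)) / (2h)  # finite-difference slope
        Δ = G(t, p) / dGdt
        t -= Δ
        abs(Δ) < tol && break
    end
    return t
end

# Implicit function theorem, applied once at the converged solution:
# dt*/dp = -(∂G/∂p) / (∂G/∂t).
function dtdp(p)
    t = solve_t(p)
    h = 1e-6
    dGdt = (G(t + h, p) - G(t - h, p)) / (2h)
    dGdp = (G(t, p + h) - G(t, p - h)) / (2h)
    return -dGdp / dGdt
end

solve_t(2.0)  # ≈ 0.5 = 1/p
dtdp(2.0)     # ≈ -0.25 = -1/p^2
```

The iterations run entirely in Float64; the derivative costs one extra Jacobian evaluation at the solution.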
This is very close to the difference between continuous sensitivity analysis methods (http://docs.juliadiffeq.org/latest/analysis/sensitivity.html) and discrete sensitivity analysis (AD). Continuous sensitivity analysis just builds another ODE, while discrete sensitivity analysis propagates derivatives through the code (AD). We will be putting a paper out soon that shows that, unless the ODE is derived at compile time, discrete sensitivity analysis is much, much faster.
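To illustrate the continuous approach with a toy scalar example of my own (not from the paper): for u' = f(u, p, t), the forward sensitivity s = ∂u/∂p solves its own ODE, s' = (∂f/∂u)s + ∂f/∂p, integrated alongside the state:

```julia
# Toy problem: u' = p*u, u(0) = 1, so u(t) = exp(p*t) and ∂u/∂p = t*exp(p*t).
# The sensitivity equation here is s' = p*s + u, s(0) = 0.
function sensitivity_euler(p; T = 1.0, n = 100_000)
    dt = T / n
    u, s = 1.0, 0.0              # state and sensitivity ∂u/∂p
    for _ in 1:n                 # crude Euler integration of the augmented system
        du = p * u               # f(u, p)
        ds = p * s + u           # (∂f/∂u)*s + ∂f/∂p
        u += dt * du
        s += dt * ds
    end
    return u, s
end

u1, s1 = sensitivity_euler(0.5)
# u1 ≈ exp(0.5), s1 ≈ 1.0 * exp(0.5) (the analytic sensitivity t*exp(p*t) at t = 1)
```

Discrete sensitivity analysis would instead push dual numbers (or a reverse tape) through the integrator's own arithmetic and get the same ∂u/∂p without ever writing down the sensitivity ODE.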
That said, this means that we will want to continue to optimize discrete sensitivity analysis, and this piece right here is an optimization that can be done in the implicit solvers. It can be done almost generically. I mentioned in the Slack the other day that technically it cannot be done generically, but only because ForwardDiff.value is a different function from ReverseDiff.value which is a different function from Measurements.value etc. If a generic value function was in Julia Base, this could be written down. For now, it's a smallish optimization so we aren't worried about it, but it is something we will add to all of DifferentialEquations.jl when the time comes.
It's still generic, it's just optimized. Even if you add specific handling for ForwardDiff.Dual it's still generic, just better at handling standard cases. That is something to keep in mind with Julia code: multiple dispatch is about building the generic code and optimizing based on datatypes when you want to, not when you have to.
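A sketch of that dispatch pattern, using a hand-rolled dual number as a stand-in for ForwardDiff.Dual (the names and the toy solver are my own, not ForwardDiff's or DiffEq's API): a generic Newton solve that any number type can flow through, plus a specialized method that iterates in plain Float64 and reattaches the derivative afterwards via the implicit function theorem:

```julia
# Minimal dual number: value plus derivative w.r.t. one parameter.
struct Dual
    val::Float64
    der::Float64
end
Base.:-(a::Dual, b::Dual) = Dual(a.val - b.val, a.der - b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.:/(a::Dual, b::Dual) = Dual(a.val / b.val,
                                 (a.der * b.val - a.val * b.der) / b.val^2)
Base.one(::Dual) = Dual(1.0, 0.0)

# Generic solver: Newton on G(t) = p*t - 1, so t* = 1/p. Works for Float64
# AND for Dual -- derivatives then propagate through every iteration.
function solve_generic(p)
    t = one(p)
    for _ in 1:5
        t = t - (p * t - one(t)) / p
    end
    return t
end

# Specialized method: iterate on the plain value, then attach the derivative
# from the implicit function theorem (dt*/dp = -t/p for this G), chained
# with the incoming perturbation p.der. Same answer, cheaper iterations.
function solve_fast(p::Dual)
    t = solve_generic(p.val)
    return Dual(t, -t / p.val * p.der)
end

solve_generic(Dual(2.0, 1.0))  # Dual(0.5, -0.25): derivative carried through
solve_fast(Dual(2.0, 1.0))     # Dual(0.5, -0.25): derivative attached at the end
```

Both methods are still "generic" in the sense the post describes: the Dual method is an optimization layered on top of code that already works for any number type.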
More generally, this idea of being smart with generic handling and separating the true continuous solution from the "control" parts of code, both for AD and other applications like Measurements, is something that we are investigating quite deeply. It brings a whole new element to optimizing generic programming.
There is a typo in your doc: the left-hand side of the ODE uses the state variable u while the right-hand side uses y.
Interesting. We are using both techniques at my workplace, and AD is much slower, but I cannot really compare specifics because the two pieces of code (one in Fortran, the other in C++) differ in so many other ways.
In your comparison, are you using AD to obtain the coefficients of the linear sensitivity ODE? If so, I guess it is the same AD technique against which you are benchmarking?
In which journal will you be publishing?
And it ties into DiffEqDiffTools. We're going to take one last pass through checking for optimizations before really committing to the conclusion, but so far haven't found anything.