I’m searching for parameters u to maximize a function obj(u). This objective function, however, depends on the derivatives of another function y(u,p), with regards to the parameters p.
An example:
abstract type S1 end
using ForwardDiff:Dual
function y(u,p1,p2)
p1_sen = Dual{S1}(p1,1.0,0.0)
p2_sen = Dual{S1}(p1,0.0,1.0)
n_u = size(u,2)
y = zeros(typeof(p1_sen*u[1,1]),n_u)
y[1] = 0.0
for i in 2:n_u
y[i] = y[i-1] + p1_sen*p2_sen*u[1,i] + p1_sen^2*u[2,i]
end
y
end
function obj(u)
p1 = 2.0
p2 = 3.0
sensitivities = y(u,p1,p2)
sensitivities[end].partials[1]*sensitivities[end].partials[2]
end
u = ones(2,100)
obj(u)
For efficient optimization I would like to calculate the gradient of obj(u) (preferably with reverse mode AD), and I’m not sure on how to do this for objectives already using other automatic differentiation tools.
I could calculate the Hessian of y(u,p), but this would be wasteful as not all second order derivatives of y are needed, only the mixed ones between u and p. In general there are many more parameters u than p.
Is p2 meant to be used here in, it doesn’t seem to be used in function y? Are you using a manual tag S1 on purpose? Why not use one of the higher level APIs?
By the way, it looks like you might be integrating an ODE in y and computing sensitivities wrt ODE parameters? In that case I think there’s already a lot of ecosystem support to do this at a high level of abstraction; I suggest reading https://docs.juliadiffeq.org/latest/analysis/sensitivity.html
Beyond those two comments, I think you should be able to compose calls to ForwardDiff.derivative / gradient etc and things should work fine provided you use the higher level APIs rather than using Dual directly.
The problem is indeed related to ODE’s. I’m aware of the sensitivity options provided by DiffEq. Two issues: even in the link you gave they use Dual directly for the more complicated examples, and my problem is more complex as I require some (but not all) second order derivatives of y(u,p). So I’m not sure direct use of Dual can be avoided, but if it could it would save me a lot of trouble.
Thanks, I’m currently using NLopt.jl to do something similar, but I’ll check out Optim.jl.