Automatic differentiation of an objective that already uses AD tools

I’m searching for parameters u that maximize a function obj(u). The objective, however, depends on the derivatives of another function y(u, p) with respect to the parameters p.
An example:

using ForwardDiff: Dual

abstract type S1 end  # manual tag type for the Dual numbers

function y(u, p1, p2)
    # seed Dual numbers carrying the sensitivities w.r.t. p1 and p2
    p1_sen = Dual{S1}(p1, 1.0, 0.0)
    p2_sen = Dual{S1}(p1, 0.0, 1.0)
    n_u = size(u, 2)
    # the accumulator's element type must be wide enough to hold the Duals
    out = zeros(typeof(p1_sen * u[1, 1]), n_u)  # out[1] is already 0.0
    for i in 2:n_u
        out[i] = out[i-1] + p1_sen * p2_sen * u[1, i] + p1_sen^2 * u[2, i]
    end
    out
end
function obj(u)
    p1 = 2.0
    p2 = 3.0
    sensitivities = y(u, p1, p2)
    # product of the two p-sensitivities of the final element
    sensitivities[end].partials[1] * sensitivities[end].partials[2]
end

u = ones(2,100)
obj(u)

For efficient optimization I would like to calculate the gradient of obj(u) (preferably with reverse-mode AD), and I’m not sure how to do this for objectives that already use other automatic differentiation tools internally.

I could calculate the full Hessian of y(u, p), but that would be wasteful: not all second-order derivatives of y are needed, only the mixed ones between u and p. In general there are many more parameters in u than in p.
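To make that concrete, here is a sketch of the mixed block I mean, written against a hypothetical Dual-free variant y_plain(u, p) of y that takes p as a vector (I’m not certain this composition is the right approach, which is part of my question):

using ForwardDiff

p0 = [2.0, 3.0]
u0 = ones(2, 100)

# only the mixed second-order block d²y[end]/(du dp), not the full Hessian:
# the Jacobian w.r.t. u of the p-gradient of the last element of y
mixed = ForwardDiff.jacobian(
    uu -> ForwardDiff.gradient(pp -> y_plain(uu, pp)[end], p0),
    u0)  # 2×200 matrix: one row per parameter in p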

Any advice would be appreciated.

Is p2 meant to be used here? It doesn’t seem to be used in function y. Are you using a manual tag S1 on purpose? Why not use one of the higher-level APIs?

By the way, it looks like you might be integrating an ODE in y and computing sensitivities with respect to the ODE parameters? If so, there is already a lot of ecosystem support for doing this at a high level of abstraction; I suggest reading https://docs.juliadiffeq.org/latest/analysis/sensitivity.html

Beyond those two comments, I think you should be able to compose calls to ForwardDiff.derivative, ForwardDiff.gradient, etc., and things should work fine, provided you use the higher-level APIs rather than constructing Dual directly.
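For instance, a minimal sketch of that composition (y_plain and obj2 are illustrative names, not from your post; y_plain is a Dual-free version of your y that takes p as a vector):

using ForwardDiff

function y_plain(u, p)
    n_u = size(u, 2)
    out = zeros(typeof(p[1] * u[1, 1]), n_u)  # element type widens automatically for nested Duals
    for i in 2:n_u
        out[i] = out[i-1] + p[1] * p[2] * u[1, i] + p[1]^2 * u[2, i]
    end
    out
end

function obj2(u)
    p = [2.0, 3.0]
    # first-order sensitivities of the last element w.r.t. p, via the high-level API
    g = ForwardDiff.gradient(q -> y_plain(u, q)[end], p)
    g[1] * g[2]
end

u0 = ones(2, 100)
grad_u = ForwardDiff.gradient(obj2, u0)  # nested forward mode; returns a 2×100 array

The tags are generated automatically here, so the inner and outer differentiation passes don’t collide.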

I know you said

For efficient optimization I would like to calculate the gradient of obj(u)

but consider giving Optim.jl’s gradient-free optimizers, like the particle swarm optimizer (PSO), a shot, at least while you are working out a better solution.
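A minimal sketch of what that could look like, assuming your obj from above is in scope (Optim minimizes, so the objective is negated; the particle count and iteration budget are placeholders):

using Optim

# optimize over a flat vector and reshape to the 2×100 layout inside
neg_obj(v) = -obj(reshape(v, 2, 100))

res = optimize(neg_obj, vec(ones(2, 100)),
               ParticleSwarm(n_particles = 20),
               Optim.Options(iterations = 1000))
u_best = reshape(Optim.minimizer(res), 2, 100)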

Yes, I mistakenly used p1 twice.

The problem is indeed related to ODEs. I’m aware of the sensitivity options provided by DiffEq. Two issues, though: even the page you linked uses Dual directly for the more complicated examples, and my problem is more complex still, since I need some (but not all) second-order derivatives of y(u, p). So I’m not sure direct use of Dual can be avoided; if it could, it would save me a lot of trouble.

Thanks, I’m currently using NLopt.jl to do something similar, but I’ll check out Optim.jl.