Dear all,
I’m writing a project that involves making parameterised mathematical models and generating cost functions for them. Mathematically, let’s take model output as
y(u,\theta), where u are the inputs and \theta are the parameters. I then want to automatically differentiate cost functions of the model output with respect to the parameters, i.e. get:
\nabla_{\theta} C[y(u,\theta)] and \nabla^2_{\theta} C[y(u,\theta)].
I tried to make a minimal example using Flux.jl. It crashes when I try to take the second derivative. Code is here:
using Flux, Flux.Tracker
function modelref(inputs, parameters)
    u = inputs
    p = parameters
    ## arbitrary smooth function
    y = u[2].*p[1].^2 .+ u[1].*u[2].*(p[1].^3).*p[2].^4
    return y
end
function costf(inputs, parameters, model)
    ## arbitrary cost function on the model output
    return model(inputs, param(parameters)) - sum(inputs.^2)
end
function rawcostf(inputs, model)
    ## make a function of the parameters only; it would be better if I could
    ## take a partial derivative of costf with respect to the parameters
    return p -> costf(inputs, p, model)
end
cf = rawcostf([1,2],modelref)
dcf(p) = Tracker.gradient(cf,p;nest=true)[1]
d2cf(p) = Tracker.jacobian(dcf,p)
However, when I call d2cf(p) I get the following error: ERROR: Nested AD not defined for getindex.
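For reference, the derivatives that dcf and d2cf should reproduce can be worked out by hand for this example. This is only a sketch, assuming the inputs u = (1, 2) passed to rawcostf above and writing \theta = (\theta_1, \theta_2) for p[1] and p[2]:

```latex
% Cost function from the code, with u = (1, 2):
C(\theta) = u_2 \theta_1^2 + u_1 u_2 \theta_1^3 \theta_2^4 - (u_1^2 + u_2^2)
          = 2\theta_1^2 + 2\theta_1^3 \theta_2^4 - 5

% Gradient with respect to the parameters:
\nabla_{\theta} C =
\begin{pmatrix}
  4\theta_1 + 6\theta_1^2 \theta_2^4 \\
  8\theta_1^3 \theta_2^3
\end{pmatrix}

% Hessian with respect to the parameters:
\nabla^2_{\theta} C =
\begin{pmatrix}
  4 + 12\theta_1 \theta_2^4   & 24\theta_1^2 \theta_2^3 \\
  24\theta_1^2 \theta_2^3     & 24\theta_1^3 \theta_2^2
\end{pmatrix}
```

At \theta = (1, 1), for instance, this gives a gradient of (10, 8) and a Hessian with rows (16, 24) and (24, 24), which is a convenient check for any automatic differentiation result.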
More generally, eventually I would like to substitute modelref with the output of ordinary differential equations, and generate cost functions on their trajectories. Would Flux be able to handle this?
I’m finding it hard to work out what’s going wrong, as I’m not fluent in reading the Flux source code. Any help would be much appreciated!
PS this is my first online post ever about a coding problem :). If I haven’t phrased my question in a conventional/easy to parse way, or I’ve made any faux pas, please let me know! Thanks a lot!