Higher order derivatives in Flux

I’ll let the Flux.jl people answer, but worst-case, you will have to pass the ReverseDiff results to the tracked variable’s gradient data field, e.g.,

X.grad[:]=ReverseDiff.hessian(f,X.data)*X.data +ReverseDiff.gradient(f,X.data)

if your objective is (\nabla^T f) x. I don’t know whether that has any nasty side-effects.