Hessian-vector product in place with ReverseDiff

I posted this question on the autodiff Slack channel but haven’t heard back.

I’m trying to compute Hessian-vector products in place with ReverseDiff. I tried the following:

julia> using ReverseDiff, LinearAlgebra

julia> function arglina(x)
         n = length(x)
         m = 2 * n
         return sum((x[i] - 2/m * sum(x[j] for j = 1:n) - 1)^2 for i = 1:n) + sum((-2/m * sum(x[j] for j = 1:n) - 1)^2 for i = n+1:m)
       end
arglina (generic function with 1 method)

julia> ∇f!(g, x) = ReverseDiff.gradient!(g, arglina, x);

julia> ∇²fv!(x, v, hv) = begin
           g = similar(x)
           ReverseDiff.gradient!(hv, x -> dot(∇f!(g, x), v), x)
       end
∇²fv! (generic function with 1 method)

julia> x = ones(5); v = copy(x); hv = similar(x);

julia> ∇²fv!(x, v, hv)
5-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0

Allocating new vectors for the result works:

julia> ∇f(x) = ReverseDiff.gradient(arglina, x);

julia> ∇²fv(x, v) = ReverseDiff.gradient(x -> dot(∇f(x), v), x);

julia> ∇²fv(x, v)
5-element Array{Float64,1}:
 2.0
 2.0
 2.0
 2.0
 2.0

Am I doing something wrong here? Or is this https://github.com/JuliaDiff/ForwardDiff.jl/issues/83?

Thanks!

It seems g should be similar to the x inside the dot, not to the x passed into ∇²fv!. That the code above doesn’t error is probably a bug; you could open an issue.


In general, in-place mutation and autodiff don’t mix well together in reverse-mode AD packages in Julia.

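To make that failure mode concrete, here is a minimal sketch, assuming the same ReverseDiff behaviour reported earlier in the thread; the toy objective f and the buffer names are illustrative, not from the original post:

```julia
using ReverseDiff, LinearAlgebra

# A toy objective whose Hessian is 2I, so the exact Hvp is 2v.
f(x) = sum(abs2, x)
∇f!(g, x) = ReverseDiff.gradient!(g, f, x)

x = ones(3); v = ones(3)

# Plain Float64 buffer: storing the inner gradient converts the
# tracked values to plain numbers, cutting the tape, so the outer
# gradient sees a constant and the Hvp comes out all zeros.
g_plain = similar(x)
hv_plain = similar(x)
ReverseDiff.gradient!(hv_plain, y -> dot(∇f!(g_plain, y), v), x)

# Tracked buffer: the tape stays connected through the inner
# gradient and the Hvp is correct.
g_tracked = ReverseDiff.track.(zero(x))
hv_tracked = similar(x)
ReverseDiff.gradient!(hv_tracked, y -> dot(∇f!(g_tracked, y), v), x)
```

The fixes discussed below (allocating g inside the closure, or tracking the buffer) both keep the inner gradient on the tape.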

Thanks for the comments. There’s a special provision for in-place gradients in ReverseDiff, though.

Replacing g = similar(x) with g = ReverseDiff.track.(zero(x)) works too.

This works:

∇²fv!(x, v, hv) = ReverseDiff.gradient!(hv, x -> begin
    g = similar(x)
    dot(∇f!(g, x), v)
end, x)

Is that what you meant?

Yes! Sorry I wasn’t clear.

Many thanks! In fact, this appears to work as well:

∇²fv!(x, v, hv) = begin
  g = ReverseDiff.track.(zero(x))
  ReverseDiff.gradient!(hv, x -> dot(∇f!(g, x), v), x)
end