Repeatedly calculating a hessian using both ForwardDiff and ReverseDiff

jacobusmmsmit · January 27, 2021, 12:21pm

I would like to calculate the Hessian matrix of a function f:\mathbb{R}^n \rightarrow \mathbb{R} where n \gg 1. In my use case the Jacobian of f is always needed, and the Hessian may or may not be needed (though this is known before the Jacobian is calculated). Additionally, I only require the Hessian with respect to the first m dimensions, where n \gg m > 1, because the first m input dimensions are parameters and the rest are data.

I am aware that with DiffResults one can recover the Jacobian from a single Hessian calculation. However, given the dimensions of the function, I thought it made more sense to first calculate the Jacobian (or indeed the gradient) with ReverseDiff and then the necessary part of the Hessian with ForwardDiff. Furthermore, I will recompute the Jacobian and Hessian for multiple values of the input, so I would like to make use of ReverseDiff’s tape compilation.
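For reference, the DiffResults route mentioned above can be sketched as follows. This is only an illustration on the toy function used later in this post, and it assumes DiffResults is installed alongside ForwardDiff; it is not the approach I ended up wanting for large n.

```julia
using ForwardDiff, DiffResults

f(x) = x[1]^2 + x[2]^2
x = [1.0, 1.0]

# Allocate a result object that holds the value, gradient and Hessian together
result = DiffResults.HessianResult(x)
result = ForwardDiff.hessian!(result, f, x)

val  = DiffResults.value(result)     # f(x)
grad = DiffResults.gradient(result)  # gradient of f at x
hess = DiffResults.hessian(result)   # Hessian of f at x
```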

The following is a minimal example demonstrating the error (without reference to the m-subsetting, although I welcome comments on its feasibility and performance):

using ForwardDiff, ReverseDiff

f(x) = x[1]^2 + x[2]^2
inputs = [1.0, 1.0]


# This returns what one would expect
correct_output = ForwardDiff.jacobian(x -> ReverseDiff.gradient(f, x), inputs)
correct_output == ForwardDiff.hessian(f, inputs)

# This returns the error below
out = zeros(2)
compiled_ftape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(2)))
ForwardDiff.jacobian(f_inputs -> ReverseDiff.gradient!(out, compiled_ftape, f_inputs), inputs)

Error:

MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2})
Closest candidates are:
  Float64(::Real, !Matched::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  Float64(::T) where T<:Number at boot.jl:716
  Float64(!Matched::Float16) at float.jl:256
  ...

Stacktrace:
 [1] convert(::Type{Float64}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2}) at .\number.jl:7
 [2] setindex!(::Array{Float64,1}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2}, ::Int64) at .\array.jl:847
 [3] _unsafe_copyto!(::Array{Float64,1}, ::Int64, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}, ::Int64, ::Int64) at .\array.jl:257
 [4] unsafe_copyto! at .\array.jl:311 [inlined]
 [5] _copyto_impl! at .\array.jl:335 [inlined]
 [6] copyto! at .\array.jl:321 [inlined]
 [7] copyto! at .\array.jl:347 [inlined]
 [8] value! at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\tracked.jl:156 [inlined]
 [9] seeded_forward_pass! at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\api\tape.jl:41 [inlined]
 [10] gradient!(::Array{Float64,1}, ::ReverseDiff.CompiledTape{ReverseDiff.GradientTape{typeof(f),ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},ReverseDiff.TrackedReal{Float64,Float64,Nothing}}}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}) at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\api\gradients.jl:79
 [11] (::var"#95#96")(::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}) at .\In[82]:1
 [12] vector_mode_dual_eval at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\apiutils.jl:37 [inlined]
 [13] vector_mode_jacobian(::var"#95#96", ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:145
 [14] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}, ::Val{true}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:21
 [15] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:19 (repeats 2 times)
 [16] top-level scope at In[82]:1
 [17] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091

lwhitefox · February 1, 2021, 10:09pm

I have no solutions, but am interested in answers also…

odow · February 1, 2021, 10:55pm

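Take a look at the type passed to f_inputs by ForwardDiff.jacobian. You have compiled the tape for Float64 arguments, but it gets passed a vector of ForwardDiff.Dual, so gradient! fails when it tries to copy those Dual values into the tape's Float64 storage (the value! and copyto! frames in your stack trace).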
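A small probe makes the element type visible. This is only a sketch: `probe` and `seen` are hypothetical names introduced for illustration, and the function being differentiated is a stand-in rather than your gradient.

```julia
using ForwardDiff

seen = Any[]  # element types observed inside the differentiated function

# Record the element type of the input, then return something differentiable
probe(x) = (push!(seen, eltype(x)); 2 .* x)

ForwardDiff.jacobian(probe, [1.0, 1.0])

# Every recorded type is a ForwardDiff.Dual, not Float64
println(seen)
```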

Untested, but I guess you could do something like this:

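const CACHE = Dict{DataType,Any}()  # element type => (compiled tape, output buffer)
function inner(x::Vector{T}) where {T<:Real}
    if !haskey(CACHE, T)
        # First call with this element type: record and compile a tape for it
        tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x))
        CACHE[T] = (tape, zeros(T, length(x)))
    end
    tape, y = CACHE[T]
    # Replay the compiled tape, writing the gradient into the cached buffer
    return ReverseDiff.gradient!(y, tape, x)
end
ForwardDiff.jacobian(inner, inputs)  # forward-over-reverse Hessian of f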
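One assumption baked into the snippet above is that inner is always called with inputs of the same length, since the cache is keyed on the element type only. A hedged variant that also keys on the length, and checks the result against ForwardDiff.hessian, might look like this (`TAPES` and `cached_gradient` are hypothetical names, and f is the toy function from the question):

```julia
using ForwardDiff, ReverseDiff

f(x) = x[1]^2 + x[2]^2  # the function from the question

# (element type, length) => (compiled tape, output buffer)
const TAPES = Dict{Tuple{DataType,Int},Any}()

function cached_gradient(x::Vector{T}) where {T<:Real}
    # Compile a tape only the first time this (type, length) pair is seen
    tape, buf = get!(TAPES, (T, length(x))) do
        (ReverseDiff.compile(ReverseDiff.GradientTape(f, x)), zeros(T, length(x)))
    end
    return ReverseDiff.gradient!(buf, tape, x)
end

inputs = [1.0, 1.0]
H = ForwardDiff.jacobian(cached_gradient, inputs)

# Compare with the pure ForwardDiff Hessian
println(H ≈ ForwardDiff.hessian(f, inputs))
```

As with any compiled ReverseDiff tape, this is only safe when f has no control flow that depends on the input values.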

jacobusmmsmit · February 2, 2021, 1:28pm

Thank you for this, your code worked perfectly. From what I understand, this code compiles a gradient tape when it sees a new input type, and otherwise reuses the tape it already compiled for that type. Is this accurate?

For those wondering, this approach was faster up to an input dimension somewhere between 50 and 500, beyond which pure ReverseDiff.hessian was faster. This implementation also had far fewer allocations in every case, though the memory usage was generally similar to or the same as that of ReverseDiff.hessian.
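On the m-subsetting, which the thread leaves open: one simple option is to close over the data and differentiate only with respect to the parameters. The sketch below uses plain ForwardDiff.hessian rather than the mixed ForwardDiff/ReverseDiff approach, on the assumption that m is small enough for forward mode to be cheap; the objective `g`, the split into `params` and `data`, and the sizes are all hypothetical.

```julia
using ForwardDiff

# Hypothetical objective: the first m entries are parameters, the rest are data
m = 2
g(x) = sum(abs2, x[1:m]) * sum(x[m+1:end]) + x[1] * x[2]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
params, data = x[1:m], x[m+1:end]

# Differentiate with respect to the parameters only; the data enter as constants
g_params(p) = g(vcat(p, data))
H_mm = ForwardDiff.hessian(g_params, params)  # m×m matrix

# Check against the top-left block of the full n×n Hessian
H_full = ForwardDiff.hessian(g, x)
println(H_mm ≈ H_full[1:m, 1:m])
```

The same closure could in principle be passed to the cached forward-over-reverse construction above, but I have not benchmarked that combination.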
