Repeatedly calculating a hessian using both ForwardDiff and ReverseDiff

I would like to calculate the Hessian matrix of a function f:\mathbb{R}^n \rightarrow \mathbb{R} where n \gg 1. In my use case the Jacobian of f is always needed, and the Hessian may or may not be needed (but this is known before the Jacobian is calculated). Additionally, I only require the Hessian with respect to the first m dimensions, again where n \gg m > 1: the first m input dimensions are parameters, and the remaining n - m are data.
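As a concrete (hypothetical) picture of the subsetting I have in mind — the function, n, and m here are stand-ins for illustration only — one can close over the data so that differentiation only ever sees the m parameter entries:

```julia
using ForwardDiff

n, m = 6, 2
f(x) = sum(abs2, x)              # stand-in for the real f: R^n -> R
x0 = collect(1.0:n)              # full input: first m entries are parameters
θ0, data = x0[1:m], x0[m+1:end]

g(θ) = f(vcat(θ, data))          # f viewed as a function of the parameters alone
H = ForwardDiff.hessian(g, θ0)   # the m×m parameter block of the full Hessian
```
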

I am aware that using DiffResults one can recover the Jacobian from a single Hessian calculation. However, given the dimensions involved, I felt it made more sense to calculate the Jacobian (or indeed gradient) with ReverseDiff and then the necessary part of the Hessian with ForwardDiff. Furthermore, I wish to use ReverseDiff because I will recompute the Jacobian and Hessian for many input values, so I can make use of ReverseDiff’s tape compilation.

The following is a minimal example demonstrating the error (it omits the m-subsetting, although I welcome comments on the feasibility/performance of that aspect too):

using ForwardDiff, ReverseDiff

f(x) = x[1]^2 + x[2]^2
inputs = [1.0, 1.0]

# This returns what one would expect
correct_output = ForwardDiff.jacobian(x -> ReverseDiff.gradient(f, x), inputs)
correct_output == ForwardDiff.hessian(f, inputs)

# This returns the error below
out = zeros(2)
compiled_ftape = ReverseDiff.compile(ReverseDiff.GradientTape(f, rand(2)))
ForwardDiff.jacobian(f_inputs -> ReverseDiff.gradient!(out, compiled_ftape, f_inputs), inputs)


MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2})
Closest candidates are:
  Float64(::Real, !Matched::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  Float64(::T) where T<:Number at boot.jl:716
  Float64(!Matched::Float16) at float.jl:256

 [1] convert(::Type{Float64}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2}) at .\number.jl:7
 [2] setindex!(::Array{Float64,1}, ::ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2}, ::Int64) at .\array.jl:847
 [3] _unsafe_copyto!(::Array{Float64,1}, ::Int64, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}, ::Int64, ::Int64) at .\array.jl:257
 [4] unsafe_copyto! at .\array.jl:311 [inlined]
 [5] _copyto_impl! at .\array.jl:335 [inlined]
 [6] copyto! at .\array.jl:321 [inlined]
 [7] copyto! at .\array.jl:347 [inlined]
 [8] value! at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\tracked.jl:156 [inlined]
 [9] seeded_forward_pass! at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\api\tape.jl:41 [inlined]
 [10] gradient!(::Array{Float64,1}, ::ReverseDiff.CompiledTape{ReverseDiff.GradientTape{typeof(f),ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},ReverseDiff.TrackedReal{Float64,Float64,Nothing}}}, ::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}) at C:\Users\jmmsm\.julia\packages\ReverseDiff\NoIPU\src\api\gradients.jl:79
 [11] (::var"#95#96")(::Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}) at .\In[82]:1
 [12] vector_mode_dual_eval at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\apiutils.jl:37 [inlined]
 [13] vector_mode_jacobian(::var"#95#96", ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:145
 [14] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}, ::Val{true}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:21
 [15] jacobian(::Function, ::Array{Float64,1}, ::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#95#96",Float64},Float64,2,Array{ForwardDiff.Dual{ForwardDiff.Tag{var"#95#96",Float64},Float64,2},1}}) at C:\Users\jmmsm\.julia\packages\ForwardDiff\qTmqf\src\jacobian.jl:19 (repeats 2 times)
 [16] top-level scope at In[82]:1
 [17] include_string(::Function, ::Module, ::String, ::String) at .\loading.jl:1091

I have no solution, but am also interested in answers…


Take a look at the type passed to f_inputs by ForwardDiff.jacobian. You have compiled the tape for Float64 arguments, but it gets passed a ForwardDiff.Dual.
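You can see this directly by recording the element type the closure receives (the closure and `seen` variable here are just for illustration):

```julia
using ForwardDiff

# Inside ForwardDiff.jacobian, the closure receives Dual numbers, not Float64,
# which is why a tape compiled for Float64 cannot store the seeded values.
seen = DataType[]
ForwardDiff.jacobian([1.0, 2.0]) do x
    push!(seen, eltype(x))
    x .^ 2
end
seen[1] <: ForwardDiff.Dual  # true
```
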

Untested, but I guess you could do something like this:

const CACHE = Dict{DataType,Any}()

function inner(x::Vector{T}) where {T<:Real}
    # Compile one tape (and one output buffer) per element type encountered
    if !haskey(CACHE, T)
        tape = ReverseDiff.compile(ReverseDiff.GradientTape(f, x))
        CACHE[T] = (tape, zeros(T, length(x)))
    end
    tape, y = CACHE[T]
    return ReverseDiff.gradient!(y, tape, x)
end

ForwardDiff.jacobian(inner, inputs)

Thank you for this, your code worked perfectly. From what I understand, this code compiles a new tape whenever it encounters an element type it has not seen before, and otherwise reuses the tape already compiled for that type. Is that accurate?

For those wondering: this mixed approach was faster up to an input dimension somewhere between 50 and 500, beyond which pure ReverseDiff.hessian was faster. The mixed implementation also had far fewer allocations in every case, though its total memory usage was generally similar to that of ReverseDiff.hessian.
