Nested ForwardDiff.jacobian calls with inplace function

#1

I’m trying to compute the jacobian of a jacobian thanks to ForwardDiff. It works perfectly for a non inplace function, but for some reason that evades me, it breaks when the function to differentiate is inplace:

using ForwardDiff
using LinearAlgebra
#some data
x0 = randn(50000);
p0 = randn(2);

#pre allocate outputs
out = similar(x0);
outjac = Array{Float64}(undef, 50000, 2); 
outjac2 = Array{Float64}(undef,100000, 2)

#'regular' and in place function
f(p) =  @. p[1] * exp(-x0 * p[2]);
f!(out, p) =  @. out = p[1] * exp(-x0 * p[2]);

jac1 = p -> ForwardDiff.jacobian(f, p)
jac2 = p -> ForwardDiff.jacobian(f!,out, p)

jac1(p0) == jac2(p0) #outputs `true`

ForwardDiff.jacobian(jac1, p0) #works
ForwardDiff.jacobian(jac2, p0) #crashes

it outpus MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{getfield(Main, Symbol("##77#78")),Float64},Float64,2})

This code is actually a more minimal example of my actual use case. My end goal is to compute said jacobian entirely inplace (for the sake of performance). Here is the code I have so far, which also breaks:

using ForwardDiff
using LinearAlgebra
#some data
x0 = randn(50000);
p0 = randn(2);

#pre allocate outputs
out = similar(x0);
outjac = Array{Float64}(undef, 50000, 2); 
outjac2 = Array{Float64}(undef,100000, 2)

#'regular' and in place function
f(p) =  @. p[1] * exp(-x0 * p[2]);
f!(out, p) =  @. out = p[1] * exp(-x0 * p[2]);

g1 = (outjacin,p) -> (outjacin .= ForwardDiff.jacobian(f, p))
g2 = (outjacin,p) -> ForwardDiff.jacobian!(outjacin, f!, out, p)

g1(outjac,p0) == g2(outjac,p0) #returns true

ForwardDiff.jacobian!(outjac2,g1, outjac, p0) #works like a charm
ForwardDiff.jacobian!(outjac2,g2, outjac, p0); # breaks

Thanks a lot for your help!

#2

The inner pre-allocated Jacobian would need to be a matrix of ForwardDiff.Duals. g2 will be called with Dual{Tag1, Float64} inputs, which will then call f! with Dual{Tag2, <:Dual{Tag1, Float64}} inputs and will expect to store its output in an Array{Dual{Tag1, Float64}}, not an Array{Float64}. It’s a little tricky to get this right, as you have to predict the element types.

#3

One way of solving this kind of problem is as follows. Instead of preallocating stuff manually, we’ll first create a function make_buffer that knows how to create an appropriate work buffer for an in-place function given the function and its input argument. For f!, that would be:

make_buffer(::typeof(f!), p) = similar(p, length(x0))

This will use the element type of p to create the right type of work buffer. We can then use this to create a function that turns an in-place function into an out-of-place function:

function make_out_of_place_func(f!, x)
    y = make_buffer(f!, x)
    x -> f!(y, x)
end

An out-of-place function created using make_out_of_place_func only accepts one element type for x, but we want a solution that accepts generic xs, so that it works for ForwardDiff.Dual as well. One way to do that is as follows:

struct OutOfPlace{F}
    f!::F
    out_of_place_funcs::Dict{Type, Function}
end
OutOfPlace(f!) = OutOfPlace(f!, Dict{Type, Function}())

eval_f(f::F, x) where {F} = f(x) # function barrier
function (oop::OutOfPlace{F})(x) where {F}
    T = eltype(x)
    f = get!(() -> make_out_of_place_func(oop.f!, x), oop.out_of_place_funcs, T)
    eval_f(f, x)
end

const f = OutOfPlace(f!)

i.e., store the in-place function in a struct, along with a dictionary that maps element types of x to out-of-place functions, where a new out-of–place function is automatically generated if the element type of x has not been seen before, but the out-of-place function is reused if a suitable one has already been created. This is pretty efficient:

using BenchmarkTools, Random
@btime f!(out, p) setup = begin # 199.650 μs (0 allocations: 0 bytes)
    out = similar($x0)
    p = rand!(similar(p0))
end
@btime f(p) setup = p = rand!(similar(p0)); # 220.685 μs (1 allocation: 32 bytes)

We can now create an in-place Jacobian function based on the out-of-place version of function f:

# derivative just uses `f`, not `f!`
g! = (outjac, p) -> ForwardDiff.jacobian!(outjac, f, p)

@btime g!(outjac, p) setup = begin
    outjac = Array{Float64}(undef, length(x0), length(p0))
    p = rand!(similar(p0))
end # 388.684 μs (3 allocations: 208 bytes) (allocations would scale with size of `p`)

and then use the same trick again:

# same trick to turn g! into an out-of-place function
make_buffer(::typeof(g!), p) = similar(p, (length(x0), length(p)))
const g = OutOfPlace(g!)

h! = (outjac2, p) -> ForwardDiff.jacobian!(outjac2, g, p)

@btime h!(outjac2, p) setup = begin # 1.637 ms (6 allocations: 576 bytes)
    outjac2 = Array{Float64}(undef, length(x0) * length(p0), length(p0))
    p = rand!(similar(p0))
end

I’ve done something similar before in https://github.com/JuliaRobotics/RigidBodyDynamics.jl/blob/master/src/caches.jl#L1

Full code:
using ForwardDiff
using LinearAlgebra

const x0 = randn(50_000);
const p0 = randn(2);
f!(out, p) =  @. out = p[1] * exp(-x0 * p[2])

make_buffer(::typeof(f!), p) = similar(p, length(x0))

function make_out_of_place_func(f!, x)
    y = make_buffer(f!, x) # make_buffer overloads will be defined below
    x -> f!(y, x)
end

struct OutOfPlace{F}
    f!::F
    out_of_place_funcs::Dict{Type, Function}
end
OutOfPlace(f!) = OutOfPlace(f!, Dict{Type, Function}())

eval_f(f::F, x) where {F} = f(x) # function barrier
function (oop::OutOfPlace{F})(x) where {F}
    T = eltype(x)
    f = get!(() -> make_out_of_place_func(oop.f!, x), oop.out_of_place_funcs, T)
    eval_f(f, x)
end

const f = OutOfPlace(f!)

using BenchmarkTools, Random
@btime f!(out, p) setup = begin # 199.650 μs (0 allocations: 0 bytes)
    out = similar($x0)
    p = rand!(similar(p0))
end
@btime f(p) setup = p = rand!(similar(p0)); # 220.685 μs (1 allocation: 32 bytes)

# derivative just uses `f`, not `f!`
g! = (outjac, p) -> ForwardDiff.jacobian!(outjac, f, p)

@btime g!(outjac, p) setup = begin
    outjac = Array{Float64}(undef, length(x0), length(p0))
    p = rand!(similar(p0))
end # 388.684 μs (3 allocations: 208 bytes) (allocations would scale with size of `p`)

# same trick to turn g! into an out-of-place function
make_buffer(::typeof(g!), p) = similar(p, (length(x0), length(p)))
const g = OutOfPlace(g!)

h! = (outjac2, p) -> ForwardDiff.jacobian!(outjac2, g, p)

@btime h!(outjac2, p) setup = begin # 1.637 ms (6 allocations: 576 bytes)
    outjac2 = Array{Float64}(undef, length(x0) * length(p0), length(p0))
    p = rand!(similar(p0))
end
3 Likes
Autodiff: calculate just the diagonal of the Hessian
#4

Thanks a lot for this very thorough explanation, I’ll experiment with it more tomorrow, but it’s an amazing answer!

#5

I’m having some troubles generalising it for any f!, the ::typeof(f!) part doesn’t go well with closure. I’ll make make_buffer a property of the object, initialized when the object is created.

#6

What do you mean exactly? In my earlier post g! was a closure for example.

#7

Oh I figured it out!.
The problem I had was that I was trying to generate the h! function for an arbitrary f!, so I had to wrap some parts in a function, notably when creating the make_buffer functions.
Eventually I came up with:

function make_out_of_place_func(f!, x, make_buffer)
    y = make_buffer(f!, x) # make_buffer overloads will be defined below
    x -> f!(y, x)
end

struct OutOfPlace{F}
    f!::F
    out_of_place_funcs::Dict{Type, Function}
    make_buffer::Function
    
    function OutOfPlace(f!::T, dict, shape::Tuple) where T
        
        mk(::T, p) = similar(p, shape)
        new{T}(f!, dict, mk)
    end
end

OutOfPlace(f!, shape) = OutOfPlace(f!, Dict{Type, Function}(), shape)

eval_f(f::F, x) where {F} = f(x) # function barrier
function (oop::OutOfPlace{F})(x) where {F}
    T = eltype(x)
    f = get!(() -> make_out_of_place_func(oop.f!, x, oop.make_buffer), oop.out_of_place_funcs, T)
    eval_f(f, x)
end



function make_hessian(f!,x0,p0)
    
    f = OutOfPlace(f!, (length(x0),))
    g! = (outjac, p) -> (ForwardDiff.jacobian!(outjac, f, p); outjac = outjac')
    g = OutOfPlace(g!,(length(x0),length(p0)))
    h! = (outjac2, p) -> (ForwardDiff.jacobian!(outjac2, g, p); reshape(outjac2',length(p0),length(p0),length(x0)))
    h!
end