Trouble using ForwardDiff on a function with a cache vector and a custom struct

I am having trouble calculating the gradient (via ForwardDiff.gradient) of a function that involves a cached vector and a custom struct. Here is my MWE:

module atest
using DiffEqBase: get_tmp, dualcache, DiffCache
using ForwardDiff

mutable struct Modelparameters{T2<:Function,T,chunk_size}
    gparam::DiffCache{Array{T,1},Array{ForwardDiff.Dual{nothing,T,chunk_size},1}}
    growthrate::T2
end

end
using ForwardDiff
using DiffEqBase: get_tmp, dualcache

function init()
    t1 = dualcache([1.5e-6, 1.5])
    t2 = atest.Modelparameters(t1, x -> get_tmp(t2.gparam,x)[1]*(x-1)^get_tmp(t2.gparam,x)[2])
    return t1,t2
end

function changeparams!(mp,p)
    tmp = get_tmp(mp.gparam,p)
    tmp .= p
end

t1,t2 = init()
result1 = t2.growthrate(2)
newp = [2e-6, 1.5]
changeparams!(t2, newp)
result2 = t2.growthrate(2)

f = p -> begin changeparams!(t2, p); t2.growthrate(1.5) end
f2 = p -> p[1]*(1.5-1)^p[2]
result3 = f([5e-6, 1.5])
result4 = f2([5e-6, 1.5])

gradient1 = ForwardDiff.gradient(f, [5e-6, 1.5])
gradient2 = ForwardDiff.gradient(f2, [5e-6, 1.5])

Now, the variables result1, result2, result3 and result4 all contain the correct values, indicating that updating the parameters captured by the closure growthrate works as intended.

However, when I calculate the gradient of the function f with respect to the parameters using ForwardDiff (result stored in gradient1), I get [0.0, 0.0] as the answer. If I instead take the gradient of f2 (which should be the same mathematical expression, just written without the cache and struct), I get the expected result [0.35…, -1.225…e-6].

Where am I going wrong with this? In other words: How do I get my changeparams!(), custom struct and ForwardDiff.gradient() to work together as I intend?

Any help appreciated :slight_smile:

Two things here. First of all, I think it's much saner to use a callable struct here rather than embedding the anonymous function into the struct; that gives you a single source of truth for the parameters. But that alone doesn't fix the issue. The issue is in x -> get_tmp(t2.gparam,x): the x you're passing in is 1.5, so it's never a dual number, which means get_tmp hands back the plain Float64 cache instead of the dual cache that changeparams! just wrote into. You need get_tmp to be triggered by the duals, so you need to pass it p so that it knows when it's differentiating.
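
To see the mismatch directly, here's a quick diagnostic sketch (run against your original MWE, with t2 and changeparams! in scope; fdbg and tmp_used are just throwaway names): inside ForwardDiff.gradient, p is a vector of duals, so changeparams! writes them into the dual cache, while growthrate(1.5) reads the untouched Float64 cache.

fdbg = p -> begin
    changeparams!(t2, p)                   # inside ForwardDiff.gradient, p is a Vector of Duals
    tmp_used = get_tmp(t2.gparam, 1.5)     # this is the buffer growthrate(1.5) actually reads
    @show eltype(p) eltype(tmp_used)       # Dual{...} vs. plain Float64
    t2.growthrate(1.5)
end
ForwardDiff.gradient(fdbg, [5e-6, 1.5])    # still [0.0, 0.0]: the partials never reach the computation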

So a fixed version is:

using DiffEqBase: get_tmp, dualcache, DiffCache
using ForwardDiff

mutable struct Modelparameters{T,chunk_size}
    gparam::DiffCache{Array{T,1},Array{ForwardDiff.Dual{nothing,T,chunk_size},1}}
end

function (t2::Modelparameters)(x,p=nothing)
    if p !== nothing
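        # a parameter vector was passed in: get_tmp dispatches on p, so duals select the dual cache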
        tmp = get_tmp(t2.gparam,p)
        tmp .= p
    else
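        # no parameters passed: pick the cache based on x instead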
        tmp = get_tmp(t2.gparam,x)
    end
    tmp[1]*(x-1)^tmp[2]
end

# changeparams! is unchanged from the original post, so it can still be used to update the cache
function changeparams!(mp,p)
    tmp = get_tmp(mp.gparam,p)
    tmp .= p
end

function init()
    t1 = dualcache([1.5e-6, 1.5])
    t2 = Modelparameters(t1)
    return t2
end

t2 = init()
result1 = t2(2)
newp = [2e-6, 1.5]
changeparams!(t2, newp)
result2 = t2(2)

f = p -> t2(1.5,p)
f2 = p -> p[1]*(1.5-1)^p[2]
result3 = f([5e-6, 1.5])
result4 = f2([5e-6, 1.5])

gradient1 = ForwardDiff.gradient(f, [5e-6, 1.5])
gradient2 = ForwardDiff.gradient(f2, [5e-6, 1.5])
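
With p passed through the callable struct, get_tmp sees the duals on the differentiation path, so gradient1 should now agree with gradient2 (roughly [0.354, -1.23e-6] at [5e-6, 1.5]).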