ForwardDiff caches usage

I have been trying to implement a cache correctly for ForwardDiff similar to DiffEqCache without too much success.

My final objective is to have a Jacobian function of the system that we can reuse in multiple steps since we make many calculations with it before calling a DiffEq solver, so it is wasteful to obtain the function multiple independent times when we could reuse it and even get the sparsity pattern, etc. similar to what is done in SparseDiffTools.jl (which by the way doesn’t work for code structured in the code like the MWE).

This to be used in PowerSimulationsDynamics.jl where we are currently using a lot of Vector{Real} and taking a performance hit to avoid the conversion error.

For reference, I looked into the notes about dual numbers from the SciML documentation and DiffEqCache and this discussion.

Also followed the discussion here: automatic differentiation not working in advanced examples? · Issue #387 · SciML/SciMLTutorials.jl · GitHub

Here is an MWE that I am trying to make work that can be executed:


using ForwardDiff
using Random

mutable struct modelf <: Function
    common::Vector
    common_dual::Vector
    du::Vector
    du_dual::Vector
    function modelf(du0::Vector{T}) where T <: Number
        # Forced the CHUNK to 3 for the MWE.
        U = ForwardDiff.Dual{nothing, T, 3}
        new(zeros(T, 3), zeros(U, 3), similar(du0), zeros(U, length(du0)))
    end
end

get_common(m::modelf, ::Type{Float64}) = m.common
get_common(m::modelf, ::Type{T}) where T <: ForwardDiff.Dual = m.common_dual
get_du(m::modelf, ::Type{Float64}) = m.du
get_du(m::modelf, ::Type{T}) where T <: ForwardDiff.Dual = m.du_dual

function (f::modelf)(u::AbstractArray{T}, t) where T
    system(get_du(f, T), u, get_common(f, T), t)
    return get_du(f, T)
end

function component1(u, common) where T
    # The code fails in this line
    common[1] = 1.0*u[1]
    return -0.1 * u[1]
end

function component2(u, common) where T
    common[2] = -0.8 * u[2]
    return -0.4 * u[2]
end

function component3(u, common) where T
    return sum(common .* [-0.1, 0.2, 0.01]) + 0.02 * u[3]
end

function system(du, u, common, t) where T
    du[1] = component1(u, common)
    du[2] = component2(u, common)
    du[3] = component3(u, common)
    nothing
end

model = modelf(zeros(3))
model(zeros(3), 0)

J = ForwardDiff.jacobian((u) -> model(u, 0), rand(3))

The code is structured in a way that there is a common vector where the different components store intermediate values (please ignore the fact in this example the information in common and in u are the same, this is just for the MWE)

The error I keep getting is a type conversion error that I don’t understand why is happening if the vector common and u should have the same type:

ERROR: LoadError: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3})
The closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:760
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3})
    @ Base ./number.jl:7
  [2] convert(#unused#::Type{ForwardDiff.Dual{nothing, Float64, 3}}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/QOqCN/src/dual.jl:371
  [3] setindex!(A::Vector{ForwardDiff.Dual{nothing, Float64, 3}}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}, i1::Int64)
    @ Base ./array.jl:839
  [4] component1(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}, common::Vector{ForwardDiff.Dual{nothing, Float64, 3}})
    @ Main ~/.julia/dev/PowerSimulationsDynamics/_dev.jl:333
  [5] system(du::Vector{ForwardDiff.Dual{nothing, Float64, 3}}, u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}, common::Vector{ForwardDiff.Dual{nothing, Float64, 3}}, t::Int64)
    @ Main ~/.julia/dev/PowerSimulationsDynamics/_dev.jl:347
  [6] (::modelf)(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}, t::Int64)
    @ Main ~/.julia/dev/PowerSimulationsDynamics/_dev.jl:327
  [7] (::var"#5#6")(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}})
    @ Main ~/.julia/dev/PowerSimulationsDynamics/_dev.jl:356
  [8] vector_mode_dual_eval
    @ ~/.julia/packages/ForwardDiff/QOqCN/src/apiutils.jl:37 [inlined]
  [9] vector_mode_jacobian(f::var"#5#6", x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/QOqCN/src/jacobian.jl:147
 [10] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}}, ::Val{true})
    @ ForwardDiff ~/.julia/packages/ForwardDiff/QOqCN/src/jacobian.jl:21
 [11] jacobian(f::Function, x::Vector{Float64}, cfg::ForwardDiff.JacobianConfig{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#5#6", Float64}, Float64, 3}}}) (repeats 2 times)
    @ ForwardDiff ~/.julia/packages/ForwardDiff/QOqCN/src/jacobian.jl:19
 [12] top-level scope
    @ ~/.julia/dev/PowerSimulationsDynamics/_dev.jl:356
in expression starting at /Users/jdlara/.julia/dev/PowerSimulationsDynamics/_dev.jl:356

Any guidance is appreciated

i noted that the common and u vectors, while duals, are not of the same, type. common has an element type of Dual{Nothing...}, where as u in the jacobian evaluation has the element type of Dual{#32...}, where #32 is the tag of the anonymous function passed to ForwardDiff.jacobian

changing component1

function component1(u, common)
   # The code fails in this line
   common[1] = one(common[1])*u[1]
   return -0.1 * u[1]
end

shows a Dual tag ordering error.

an strategy i would try would be:

  1. using a vector type to store du and common in one struct (ComponentArrays.jl or LabelledArrays.jl)
  2. build a ForwardDiff.JacobianConfig to store the duals and reuse those.
  3. Overloading ForwardDiff.Jacobian(model::modelf,du::V) to use the stored JacobianConfig instead of generating a new one

You’re missing all of the reinterpret handling. Why not just make the dual_cache be one of the fields of the function?

BTW, this whole struct is not concretely typed and so it will be slow. And you might not want to <: Function for specialization.

1 Like

Thanks for the answer, I was just trying to follow the pattern in the AD advanced examples thread by using the functor. Is the re-interpret done in DiffEq automatically if I were to use a dualcache instead of vectors of duals in the struct?

Yeah, the struct isn’t optimized it is just an MWE in the actual implementation it will be handled correctly.

The solution to this problem was using a call to reinterpret as @ChrisRackauckas suggested in the getter functions. Here is a code that works.

Be aware this is not optimized code.

using ForwardDiff
using Random

mutable struct modelf <: Function
    common::Vector
    common_dual::Vector
    du::Vector
    du_dual::Vector
    function modelf(du0::Vector{T}) where T <: Number
        # Forced the CHUNK to 3 for the MWE.
        U = ForwardDiff.Dual{nothing, T, 3}
        new(zeros(T, 3), zeros(U, 3), similar(du0), zeros(U, length(du0)))
    end
end

get_common(m::modelf, ::Type{Float64}) = m.common
get_common(m::modelf, ::Type{T}) where T <: ForwardDiff.Dual = reinterpret(T, m.common_dual)
get_du(m::modelf, ::Type{Float64}) = m.du
get_du(m::modelf, ::Type{T}) where T <: ForwardDiff.Dual = reinterpret(T, m.du_dual)

function (f::modelf)(u::AbstractArray{T}, t) where T
    system(get_du(f, T), u, get_common(f, T), t)
    return get_du(f, T)
end

function component1(u, common) where T
    common[1] = 1.0*u[1]
    return -0.1 * u[1]
end

function component2(u, common) where T
    common[2] = -0.8 * u[2]
    return -0.4 * u[2]
end

function component3(u, common) where T
    return sum(common .* [-0.1, 0.2, 0.01]) + 0.02 * u[3]
end

function system(du, u, common, t) where T
    du[1] = component1(u, common)
    du[2] = component2(u, common)
    du[3] = component3(u, common)
    nothing
end

model = modelf(zeros(3))
model(zeros(3), 0)

fu = (u) -> model(u, 0)
jconfig = ForwardDiff.JacobianConfig(fu , rand(3), ForwardDiff.Chunk{3}())
J = ForwardDiff.jacobian(fu, rand(3), jconfig)
1 Like