ForwardDiff and FunctionWrappers

I have code that makes use of FunctionWrappers in some places. I need to compute gradients for code that involves these wrappers - is this possible? For example,

using FunctionWrappers
using ForwardDiff
f1 = FunctionWrapper{Float64, NTuple{2, Float64}}((x, y) -> x + 1)
f2 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> x[1] * x[2])
f3 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> sin(x[1]*x[2]))
f = [f1,f2,f3]
function example_function(u; f)
    S = f[1](u[1], u[2]) + f[2](u) + f[3](u)
    return S 
u = [2.4, 7.2]
example_function(u; f)
ForwardDiff.gradient(u -> example_function(u; f), u)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:772
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
    @ Base .\number.jl:7
  [2] macro expansion
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:137 [inlined]
  [3] do_ccall(f::FunctionWrapper{Float64, Tuple{Float64, Float64}}, args::Tuple{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}, ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ FunctionWrappers C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:125
  [4] FunctionWrapper
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:144 [inlined]
  [5] example_function(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}; f::Vector{FunctionWrapper{Float64}})
    @ Main .\Untitled-1:43
  [6] (::var"#36#37")(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ Main .\Untitled-1:48
  [7] vector_mode_dual_eval!
    @ C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\apiutils.jl:37 [inlined]
  [8] vector_mode_gradient(f::var"#36#37", x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}})
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:106
  [9] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}, ::Val{true})
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:19
 [10] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}) (repeats 2 times)
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:17
 [11] top-level scope
    @ Untitled-1:48

The actual application is for solving ODEs with DifferentialEquations.jl where the system does require the evaluation of some of these wrappers,

Easiest thing: set autodiff=false in the solver. It doesn’t make the hugest difference to performance anyways.

More time consuming: using FunctionWrappersWrappers.jl like how DiffEqBase does to support all of the Dual combinations.

1 Like

Thanks @ChrisRackauckas. I had looked at previously looked at FunctionWrappersWrappers.jl, but I struggled to really understand much of what was going on even when looking at the test. That snippet looks useful once I’ve understood it.

I’m interested that you say autodiff might not provide the biggest performance boost: Would this hold true even if e.g. I’m considering a (sparse) system with ~5-10,000 ODEs?

It can be, sometimes, if you do SciMLBase.FullSpecialize and allow a high chunk size. Even then, it’s not the biggest improvement. It’s like, 2x or 3x. At a single chunk it’s very minimal though. Forward mode for Jacobians is more about the numerical stability aspects.

1 Like

Thanks for that. I went with something like this for allowing differentiation:

using DiffEqBase, FunctionWrappersWrappers
import DiffEqBase: dualgen
# Define method for wrapping functions 
get_dual_arg_types(::Type{T}, ::Type{U}, ::Type{P}) where {T,U,P} = (
    Tuple{T,T,T,U,P},                       # Typical signature 
    Tuple{T,T,T,dualgen(U),P},              # Signature with "u" a Dual 
    Tuple{T,T,dualgen(T),U,P},              # Signature with "t" a Dual 
    Tuple{T,T,dualgen(T),dualgen(U),P})     # Signature with "u" and "t" Duals
get_dual_ret_types(::Type{U}, ::Type{T}) where {U,T} = (U, dualgen(U), dualgen(T), dualgen(promote_type(U, T)))
function wrap_functions(functions, parameters::P; float_type::Type{T}=Float64, u_type::Type{U}=Float64) where {T,U,P}
    all_arg_types = ntuple(i -> get_dual_arg_types(T, U, typeof(parameters[i])), length(parameters))
    all_ret_types = ntuple(i -> get_dual_ret_types(U, T), length(parameters))
    wrapped_functions = ntuple(i -> FunctionWrappersWrapper(functions[i], all_arg_types[i], all_ret_types[i]), length(parameters))
    return wrapped_functions

using OrdinaryDiffEq
# Define an example ODE to be differentiated
function sysde!(du, u, p, t)
    wrapped_functions, parameters, x, y = p
    du[1] = wrapped_functions[1](x, y, t, u[1], parameters[1]) + wrapped_functions[1](x, y, t, u[2], parameters[1])
    du[2] = wrapped_functions[2](x, y, t, u[2], parameters[2]) 
    du[3] = 5.0 + u[1] * u[2] + u[3] * wrapped_functions[2](x, y, t, u[3], parameters[2])
    return nothing

# Solve
f1 = (x, y, t, u, p) -> u + x * y
f2 = (x, y, t, u, p) -> u + u^2 * p[1] + t
parameters = (nothing, 1.0)
wrapped_functions = wrap_functions((f1, f2), parameters)
x, y = rand(2)
u0 = rand(3)
tspan = (0.0,1.0)
ode_p = (wrapped_functions, parameters, x, y)
prob = ODEProblem(sysde!, u0, tspan, ode_p)
sol = solve(prob, Rosenbrock23(autodiff=true))

@code_warntype sysde!(rand(3), rand(3), ode_p, 0.0)

Seems to work fine. Not sure if it’s the best way but I think it works.

allow a high chunk size

Can this be done easily with the way I’ve used dualgen above? How is the chunk size set / what is “high”?

What’s the signature of the dual you’re choosing and what’s the chunk size?

I’m not entirely sure - I’ve just taken the dualgen function from the DiffEqBase code you linked, where

dualgen(::Type{T}) where {T} = ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEqTag, T}, T, 1}

I’m not too familiar with these details about ForwardDiff.jl, I think I’ve always just used what the chunk size defaults to without knowing what I should be setting it to.

That’s chunk size 1. That’s cutting a bit of performance to make this a bit simpler to setup.

1 Like

I see. In my case these are all scalars, so I suppose 1 is all I need! Thanks for your help and patience - very helpful as always!

1 Like