# ForwardDiff and FunctionWrappers

I have code that makes use of FunctionWrappers in some places. I need to compute gradients for code that involves these wrappers - is this possible? For example,

``````
using FunctionWrappers
using ForwardDiff
f1 = FunctionWrapper{Float64, NTuple{2, Float64}}((x, y) -> x + 1)
f2 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> x[1] * x[2])
f3 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> sin(x[1] * x[2]))
f = [f1, f2, f3]
function example_function(u; f)
    S = f[1](u[1], u[2]) + f[2](u) + f[3](u)
    return S
end
u = [2.4, 7.2]
example_function(u; f)                              # works
ForwardDiff.gradient(u -> example_function(u; f), u) # errors
``````
``````
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:772
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
    @ Base .\number.jl:7
  [2] macro expansion
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:137 [inlined]
  [3] do_ccall(f::FunctionWrapper{Float64, Tuple{Float64, Float64}}, args::Tuple{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}, ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ FunctionWrappers C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:125
  [4] FunctionWrapper
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:144 [inlined]
  [5] example_function(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}; f::Vector{FunctionWrapper{Float64}})
    @ Main .\Untitled-1:43
  [6] (::var"#36#37")(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ Main .\Untitled-1:48
  [7] vector_mode_dual_eval!
    @ C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\apiutils.jl:37 [inlined]
  [9] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}, ::Val{true})
  [10] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}) (repeats 2 times)
  [11] top-level scope
    @ Untitled-1:48
``````

The actual application is solving ODEs with DifferentialEquations.jl, where the system does require the evaluation of some of these wrappers.

Easiest thing: set `autodiff=false` in the solver. It doesn't make a huge difference to performance anyway.
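As a sketch, assuming a Rosenbrock-type solver and an already-constructed `prob`, that would look like:

``````
using OrdinaryDiffEq

# With autodiff disabled, the solver builds Jacobians via finite
# differences, so no Dual numbers ever reach the FunctionWrappers.
sol = solve(prob, Rosenbrock23(autodiff = false))
``````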

More time-consuming: use FunctionWrappersWrappers.jl the way DiffEqBase does, to support all of the Dual combinations.
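The idea, roughly: a `FunctionWrappersWrapper` holds one `FunctionWrapper` per precompiled signature and dispatches to whichever matches the call. A minimal sketch (the argument and return types here are purely illustrative):

``````
using FunctionWrappersWrappers, ForwardDiff

g = x -> sin(x)
DualT = ForwardDiff.Dual{Nothing, Float64, 1}  # illustrative Dual type

# One inner FunctionWrapper per signature: plain Float64 and a Dual input.
gw = FunctionWrappersWrapper(g,
    (Tuple{Float64}, Tuple{DualT}),  # tuple of argument-type tuples
    (Float64, DualT))                # matching return types

gw(1.0)  # hits the Float64 wrapper; a Dual input would hit the second
``````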


Thanks @ChrisRackauckas. I had previously looked at FunctionWrappersWrappers.jl, but I struggled to really understand much of what was going on, even when looking at the tests. That snippet looks useful once I've understood it.

I'm interested that you say autodiff might not provide the biggest performance boost: would this hold true even if, e.g., I'm considering a (sparse) system with ~5,000-10,000 ODEs?

It can be, sometimes, if you do `SciMLBase.FullSpecialize` and allow a high chunk size. Even then, it's not the biggest improvement. It's like 2x or 3x. At a single chunk it's very minimal though. Forward mode for Jacobians is more about the numerical stability aspects.
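As a sketch of what opting into full specialization looks like (reusing `sysde!`, `u0`, `tspan`, and `ode_p` from a problem setup like the one below):

``````
using OrdinaryDiffEq, SciMLBase

# FullSpecialize compiles a fully specialized ODE function, trading
# extra compile time for runtime speed (the default is AutoSpecialize).
prob = ODEProblem{true, SciMLBase.FullSpecialize}(sysde!, u0, tspan, ode_p)
sol = solve(prob, Rosenbrock23())
``````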


Thanks for that. I went with something like this for allowing differentiation:

``````
using DiffEqBase, FunctionWrappersWrappers
import DiffEqBase: dualgen

# Define methods for wrapping functions
get_dual_arg_types(::Type{T}, ::Type{U}, ::Type{P}) where {T,U,P} = (
    Tuple{T,T,T,U,P},                    # Typical signature
    Tuple{T,T,T,dualgen(U),P},           # Signature with "u" a Dual
    Tuple{T,T,dualgen(T),U,P},           # Signature with "t" a Dual
    Tuple{T,T,dualgen(T),dualgen(U),P})  # Signature with "u" and "t" Duals
get_dual_ret_types(::Type{U}, ::Type{T}) where {U,T} =
    (U, dualgen(U), dualgen(T), dualgen(promote_type(U, T)))
function wrap_functions(functions, parameters::P; float_type::Type{T}=Float64, u_type::Type{U}=Float64) where {T,U,P}
    all_arg_types = ntuple(i -> get_dual_arg_types(T, U, typeof(parameters[i])), length(parameters))
    all_ret_types = ntuple(i -> get_dual_ret_types(U, T), length(parameters))
    wrapped_functions = ntuple(i -> FunctionWrappersWrapper(functions[i], all_arg_types[i], all_ret_types[i]), length(parameters))
    return wrapped_functions
end

using OrdinaryDiffEq
# Define an example ODE to be differentiated
function sysde!(du, u, p, t)
    wrapped_functions, parameters, x, y = p
    du[1] = wrapped_functions[1](x, y, t, u[1], parameters[1]) + wrapped_functions[1](x, y, t, u[2], parameters[1])
    du[2] = wrapped_functions[2](x, y, t, u[2], parameters[2])
    du[3] = 5.0 + u[1] * u[2] + u[3] * wrapped_functions[2](x, y, t, u[3], parameters[2])
    return nothing
end

# Solve
f1 = (x, y, t, u, p) -> u + x * y
f2 = (x, y, t, u, p) -> u + u^2 * p[1] + t
parameters = (nothing, 1.0)
wrapped_functions = wrap_functions((f1, f2), parameters)
x, y = rand(2)
u0 = rand(3)
tspan = (0.0, 1.0)
ode_p = (wrapped_functions, parameters, x, y)
prob = ODEProblem(sysde!, u0, tspan, ode_p)
sol = solve(prob, Rosenbrock23(autodiff=true))

@code_warntype sysde!(rand(3), rand(3), ode_p, 0.0)
``````

Seems to work fine. Not sure if it's the best way, but I think it works.

> allow a high chunk size

Can this be done easily with the way I've used `dualgen` above? How is the chunk size set / what is "high"?

What's the signature of the dual you're choosing, and what's the chunk size?

I'm not entirely sure - I've just taken the `dualgen` function from the DiffEqBase code you linked, where

``````
dualgen(::Type{T}) where {T} = ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEqTag, T}, T, 1}
``````

I'm not too familiar with these details of ForwardDiff.jl; I think I've always just used the default chunk size without knowing what I should be setting it to.

That's chunk size 1. That's cutting a bit of performance to make this a bit simpler to set up.
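For reference, the chunk size is the last type parameter of `ForwardDiff.Dual`, so a larger-chunk variant of `dualgen` might look like this (a hypothetical helper, not DiffEqBase's API; the pre-wrapped signatures would need to cover every chunk size the solver actually uses):

``````
using ForwardDiff
import DiffEqBase: OrdinaryDiffEqTag

# Like DiffEqBase's dualgen, but with a configurable number of partials N.
# A Dual with N partials propagates N directional derivatives per call.
dualgen_n(::Type{T}, ::Val{N}) where {T,N} =
    ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEqTag, T}, T, N}

dualgen_n(Float64, Val(4))  # a Dual carrying 4 partials
``````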


I see. In my case these are all scalars, so I suppose 1 is all I need! Thanks for your help and patience - very helpful as always!
