ForwardDiff and FunctionWrappers

I have code that makes use of FunctionWrappers in some places. I need to compute gradients for code that involves these wrappers - is this possible? For example,

using FunctionWrappers
using ForwardDiff
f1 = FunctionWrapper{Float64, NTuple{2, Float64}}((x, y) -> x + 1)
f2 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> x[1] * x[2])
f3 = FunctionWrapper{Float64, Tuple{Vector{Float64}}}(x -> sin(x[1]*x[2]))
f = [f1,f2,f3]
function example_function(u; f)
    S = f[1](u[1], u[2]) + f[2](u) + f[3](u)
    return S 
end
u = [2.4, 7.2]
example_function(u; f)
ForwardDiff.gradient(u -> example_function(u; f), u)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
  (::Type{T})(::T) where T<:Number at boot.jl:772
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50
  ...
Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2})
    @ Base .\number.jl:7
  [2] macro expansion
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:137 [inlined]
  [3] do_ccall(f::FunctionWrapper{Float64, Tuple{Float64, Float64}}, args::Tuple{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}, ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ FunctionWrappers C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:125
  [4] FunctionWrapper
    @ C:\Users\licer\.julia\packages\FunctionWrappers\Q5cBx\src\FunctionWrappers.jl:144 [inlined]
  [5] example_function(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}; f::Vector{FunctionWrapper{Float64}})
    @ Main .\Untitled-1:43
  [6] (::var"#36#37")(u::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}})
    @ Main .\Untitled-1:48
  [7] vector_mode_dual_eval!
    @ C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\apiutils.jl:37 [inlined]
  [8] vector_mode_gradient(f::var"#36#37", x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}})
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:106
  [9] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}, ::Val{true})
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:19
 [10] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#36#37", Float64}, Float64, 2}}}) (repeats 2 times)
    @ ForwardDiff C:\Users\licer\.julia\packages\ForwardDiff\eqMFf\src\gradient.jl:17
 [11] top-level scope
    @ Untitled-1:48

The actual application is solving ODEs with DifferentialEquations.jl, where the system requires evaluating some of these wrappers.
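The failure can be reduced to a minimal case: calling a FunctionWrapper converts each argument to the declared type before the internal ccall, and a Dual cannot be converted to Float64. A sketch (the commented-out call reproduces the MethodError above):

```julia
using FunctionWrappers, ForwardDiff

# The wrapper's signature pins the argument type to Float64 ...
fw = FunctionWrappers.FunctionWrapper{Float64, Tuple{Float64}}(x -> x + 1)
fw(1.0)  # fine

# ... so a Dual argument fails at the convert step inside the wrapper:
d = ForwardDiff.Dual(1.0, 1.0)
# fw(d)  # MethodError: no method matching Float64(::ForwardDiff.Dual{...})
```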

Easiest thing: set autodiff=false in the solver. It doesn’t make a huge difference to performance anyway.
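Applied to an ODEProblem, that might look like the following sketch (`prob` stands in for a problem like the one defined later in the thread):

```julia
using OrdinaryDiffEq

# With autodiff disabled the solver builds the Jacobian by finite
# differences, so the FunctionWrappers only ever see Float64 arguments.
sol = solve(prob, Rosenbrock23(autodiff = false))
```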

More time-consuming: use FunctionWrappersWrappers.jl the way DiffEqBase does, to support all of the Dual combinations.
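A minimal sketch of that approach, assuming the `FunctionWrappersWrapper(f, arg_type_tuples, ret_types)` constructor that DiffEqBase uses; `MyTag` is a hypothetical tag type standing in for DiffEqBase's `OrdinaryDiffEqTag`:

```julia
using FunctionWrappersWrappers, ForwardDiff

struct MyTag end
const D = ForwardDiff.Dual{ForwardDiff.Tag{MyTag, Float64}, Float64, 1}

# One wrapper covering both the plain and the Dual call signature;
# the matching inner FunctionWrapper is selected at call time.
g = FunctionWrappersWrappers.FunctionWrappersWrapper(
    x -> x[1] * x[2],
    (Tuple{Vector{Float64}}, Tuple{Vector{D}}),  # accepted argument signatures
    (Float64, D))                                # corresponding return types

g([2.0, 3.0])                                    # Float64 path
d1 = ForwardDiff.Dual{ForwardDiff.Tag{MyTag, Float64}}(2.0, 1.0)
d2 = ForwardDiff.Dual{ForwardDiff.Tag{MyTag, Float64}}(3.0, 0.0)
g([d1, d2])                                      # Dual path, same object
```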


Thanks @ChrisRackauckas. I had previously looked at FunctionWrappersWrappers.jl, but I struggled to understand much of what was going on, even when looking at the tests. That snippet looks useful once I’ve understood it.

I’m interested that you say autodiff might not provide the biggest performance boost: would this hold true even for, e.g., a (sparse) system with ~5,000-10,000 ODEs?

It can be, sometimes, if you use SciMLBase.FullSpecialize and allow a high chunk size. Even then, it’s not the biggest improvement, more like 2x or 3x. With a chunk size of 1 it’s very minimal. Forward mode for Jacobians is more about the numerical stability aspects.
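A sketch of those two knobs, assuming the `ODEProblem{iip, specialization}` constructor and the algorithm's `chunk_size` keyword (names may differ between versions; `sysde!`, `u0`, `tspan`, and `ode_p` are as defined later in the thread):

```julia
using OrdinaryDiffEq, SciMLBase

# Opt into full specialization and request a larger ForwardDiff chunk
# for the internal Jacobian computation.
prob = ODEProblem{true, SciMLBase.FullSpecialize}(sysde!, u0, tspan, ode_p)
sol = solve(prob, Rosenbrock23(chunk_size = Val{4}()))
```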


Thanks for that. I went with something like this for allowing differentiation:

using DiffEqBase, FunctionWrappersWrappers
import DiffEqBase: dualgen
# Define method for wrapping functions 
get_dual_arg_types(::Type{T}, ::Type{U}, ::Type{P}) where {T,U,P} = (
    Tuple{T,T,T,U,P},                       # Typical signature 
    Tuple{T,T,T,dualgen(U),P},              # Signature with "u" a Dual 
    Tuple{T,T,dualgen(T),U,P},              # Signature with "t" a Dual 
    Tuple{T,T,dualgen(T),dualgen(U),P})     # Signature with "u" and "t" Duals
get_dual_ret_types(::Type{U}, ::Type{T}) where {U,T} = (U, dualgen(U), dualgen(T), dualgen(promote_type(U, T)))
function wrap_functions(functions, parameters::P; float_type::Type{T}=Float64, u_type::Type{U}=Float64) where {T,U,P}
    all_arg_types = ntuple(i -> get_dual_arg_types(T, U, typeof(parameters[i])), length(parameters))
    all_ret_types = ntuple(i -> get_dual_ret_types(U, T), length(parameters))
    wrapped_functions = ntuple(i -> FunctionWrappersWrapper(functions[i], all_arg_types[i], all_ret_types[i]), length(parameters))
    return wrapped_functions
end

using OrdinaryDiffEq
# Define an example ODE to be differentiated
function sysde!(du, u, p, t)
    wrapped_functions, parameters, x, y = p
    du[1] = wrapped_functions[1](x, y, t, u[1], parameters[1]) + wrapped_functions[1](x, y, t, u[2], parameters[1])
    du[2] = wrapped_functions[2](x, y, t, u[2], parameters[2]) 
    du[3] = 5.0 + u[1] * u[2] + u[3] * wrapped_functions[2](x, y, t, u[3], parameters[2])
    return nothing
end

# Solve
f1 = (x, y, t, u, p) -> u + x * y
f2 = (x, y, t, u, p) -> u + u^2 * p[1] + t
parameters = (nothing, 1.0)
wrapped_functions = wrap_functions((f1, f2), parameters)
x, y = rand(2)
u0 = rand(3)
tspan = (0.0,1.0)
ode_p = (wrapped_functions, parameters, x, y)
prob = ODEProblem(sysde!, u0, tspan, ode_p)
sol = solve(prob, Rosenbrock23(autodiff=true))

@code_warntype sysde!(rand(3), rand(3), ode_p, 0.0)

Seems to work fine. Not sure if it’s the best way but I think it works.

allow a high chunk size

Can this be done easily with the way I’ve used dualgen above? How is the chunk size set / what is “high”?

What’s the signature of the dual you’re choosing and what’s the chunk size?

I’m not entirely sure - I’ve just taken the dualgen function from the DiffEqBase code you linked, where

dualgen(::Type{T}) where {T} = ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEqTag, T}, T, 1}

I’m not too familiar with these details of ForwardDiff.jl; I’ve always just used the default chunk size without knowing what I should set it to.

That’s chunk size 1. That’s cutting a bit of performance to make this a bit simpler to set up.
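For reference, a hypothetical chunk-N generalization of `dualgen` might look like the following; note that every chunk size to be supported would need its own entries in the argument- and return-type tuples passed to the wrapper:

```julia
using ForwardDiff
import DiffEqBase: OrdinaryDiffEqTag

# Hypothetical chunk-N variant of DiffEqBase.dualgen (which is fixed at N = 1).
dualgen_n(::Type{T}, ::Val{N}) where {T,N} =
    ForwardDiff.Dual{ForwardDiff.Tag{OrdinaryDiffEqTag, T}, T, N}

dualgen_n(Float64, Val(4))  # a Dual type carrying 4 partials
```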


I see. In my case these are all scalars, so I suppose 1 is all I need! Thanks for your help and patience - very helpful as always!
