Autodiff working for my model function but not for my loss function

Vitor_Patricio_Canta · July 31, 2024, 8:18am

Hi,
I have a function that calculates the analytical solution to a pulse injection through a column (advective dispersion transport). I am trying to fit some parameters to the data.
I manage to calculate the derivative of my pulse_injection function but not the derivative of the loss function. I have tried a bunch of different things without success and I was hoping someone could spot my mistake here. Thanks a lot!

This is my function:

function pulse_injection(
    cr::Matrix{T},
    x::Vector{T},
    t::Vector{T},
    c0,
    c_in,
    v,
    Dl,
    t_pulse
    ) where T
    #cr = zeros(length(t), length(x))
    ratio = (c0 - c_in) / 2
    @inbounds for i=1:length(x), j=1:length(t)
        exp_term = exp(v .* x[i] / Dl)
        cr[j, i] = c_in .+ ratio .* (erfc.((x[i] .- v .* t[j])
         ./ (2 .* sqrt.(Dl .* t[j]))) .+ exp_term
         .* erfc.((x[i] .+ v .* t[j]) ./ (2 .* sqrt.(Dl .* t[j]))))
        if t[j] >= t_pulse
            co = - ratio .* (erfc.((x[i] .- v .* (t[j] .- t_pulse))
            ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))) .+ exp_term
            .* erfc.((x[i] .+ v .* (t[j] .- t_pulse)) ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))))
            cr[j, i] = cr[j, i] .+ co
        end
    end
    return nothing
end

This is my loss function:

function loss_julia(p::Vector)
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    De = 2.01e-9 + alpha_l * vp
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

I can run the loss function:

loss_julia([0.3, 1e-3])

and calculate the jacobian of the pulse_injection function:

ForwardDiff.jacobian((c_model, p)-> pulse_injection(c_model, x, t, c0, c_in, q/p[1], 2.01e-9 + p[2]*q/p[1], t_pulse), c_model, [0.3, 1e-3])

but not from the loss function:

ForwardDiff.jacobian(loss_julia, [0.3, 1e-3])

I get this error:

ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(loss_julia), Float64}, Float64, 2})```

Full code snippet:
```julia
using SpecialFunctions
using BenchmarkTools
using Plots
using Optim
using OptimizationOptimJL
using Optimization
using SciMLSensitivity
using CSV
using DataFrames
using ForwardDiff

function pulse_injection(
    cr::Matrix{T},
    x::Vector{T},
    t::Vector{T},
    c0,
    c_in,
    v,
    Dl,
    t_pulse
    ) where T
    #cr = zeros(length(t), length(x))
    ratio = (c0 - c_in) / 2
    @inbounds for i=1:length(x), j=1:length(t)
        exp_term = exp(v .* x[i] / Dl)
        cr[j, i] = c_in .+ ratio .* (erfc.((x[i] .- v .* t[j])
         ./ (2 .* sqrt.(Dl .* t[j]))) .+ exp_term
         .* erfc.((x[i] .+ v .* t[j]) ./ (2 .* sqrt.(Dl .* t[j]))))
        if t[j] >= t_pulse
            co = - ratio .* (erfc.((x[i] .- v .* (t[j] .- t_pulse))
            ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))) .+ exp_term
            .* erfc.((x[i] .+ v .* (t[j] .- t_pulse)) ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))))
            cr[j, i] = cr[j, i] .+ co
        end
    end
    return nothing
end

using CSV
using DataFrames
#load data
data = CSV.read("data/data_a.csv",DataFrame)
global t = data[:,1]
global c = data[:,2]
global c0 = 2.0
global c_in = 0.0

global t_pulse = 3600.0
global x = [0.121]
Q0_ml = 6 #ml/hr
Q0 = Q0_ml*1e-6/3600 # m3/s
diam = 0.037 # diameter of the column [m]
area = pi*(diam/2)^2 # cross-sectional area of the column [m2]
global q = Q0/area # specific discharge [m/s]

c_model = zeros(length(t), length(x))
function loss_julia(p::Vector{T}) where T
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    De = 2.01e-9 + alpha_l * vp
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

loss_julia([0.3, 1e-3])

ForwardDiff.jacobian((c_model, p)-> pulse_injection(c_model, x, t, c0, c_in, q/p[1], 2.01e-9 + p[2]*q/p[1], t_pulse), c_model, [0.3, 1e-3])
ForwardDiff.jacobian(loss_julia, [0.3, 1e-3])

nilshg · July 31, 2024, 8:29am

global c = data[:,2]
c_model = zeros(length(t), length(x))
function loss_julia(p::Vector{T}) where T
    (...)
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

This is not good:

Your function depends on global variables, this should be avoided almost always
It looks like your pulse_injection function mutates its input arguments, conventionally this is denoted in Julia by appending a bang to the function name (pulse_injection!)
You fix the type of c_model to be Array{Float64} in your zeros call. You want zeros(eltype(z), length(t), length(x)) if you are using this array in autodiff code, as it will need to accomodate dual numbers:

julia> using ForwardDiff

julia> x = zeros(3)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> x[1] = ForwardDiff.Dual(1)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{Nothing, Int64, 0})

Vitor_Patricio_Canta · July 31, 2024, 8:45am

Thanks! That was fast. Sorry. I have more questions. In relation to your topics:

I guess keeping them as normal variables should be no problem, right? My concern is that I want the parameters to be available to the loss_function without calling them explicitly. Would suggest a different approach.
True. Thanks for the tips/
You suggest zeros(eltype(z), length(t), length(x)). What would z be in this case, or in general?

nilshg · July 31, 2024, 9:12am

I’m not sure what you mean by “normal variables” here. The key is that all variables in your function are actually arguments to that function. If you need a one-arg version of your function for optimization, the usual pattern is to create a closure like

loss(a, b, c) = a*b^c
b_val = 1; c_val = 2
optimize(z -> loss(z, b_val, c_val), ...)

z is the input to your loss function which you are optimizing over and which therefore will become a Dual during autodiff.

Vitor_Patricio_Canta · July 31, 2024, 11:44am

Thanks a lot @nilshg

I could make it work by creating the mutated variable inside the function with the parameters signature as you have mentioned above:

function loss_julia(p, x, t, c0, c_in, q, De, t_pulse)
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    Dl = De + alpha_l * vp
    c_model = zeros(eltype(p),length(t), length(x))
    pulse_injection!(c_model, x, t, c0, c_in, vp, Dl, t_pulse)
    return sum((@view(c_model[:, 1]) - c).^2)
end

Topic		Replies	Views
Gradient of loss function from ODE solution General Usage question , autodiff	1	194	May 7, 2024
Forward differentiation and differential equations Modelling & Simulations diffeq , differentiation , optim , optimization	13	4421	November 7, 2023
Forward mode Autodiff for Matrix multiplication General Usage autodiff	3	770	November 25, 2021
Autograd through loss function with derivatives in Machine Learning	1	879	March 25, 2021
Getting ForwardDiff.Dual to propagate through function New to Julia optim , forwarddiff , autodiff	5	730	December 13, 2023

Autodiff working for my model function but not for my loss function

Related topics