Autodiff working for my model function but not for my loss function

Hi,
I have a function that calculates the analytical solution to a pulse injection through a column (advective dispersion transport). I am trying to fit some parameters to the data.
I manage to calculate the derivative of my pulse_injection function but not the derivative of the loss function. I have tried a bunch of different things without success and I was hoping someone could spot my mistake here. Thanks a lot!

This is my function:

function pulse_injection(
    cr::Matrix{T},
    x::Vector{T},
    t::Vector{T},
    c0,
    c_in,
    v,
    Dl,
    t_pulse
    ) where T
    #cr = zeros(length(t), length(x))
    ratio = (c0 - c_in) / 2
    @inbounds for i=1:length(x), j=1:length(t)
        exp_term = exp(v .* x[i] / Dl)
        cr[j, i] = c_in .+ ratio .* (erfc.((x[i] .- v .* t[j])
         ./ (2 .* sqrt.(Dl .* t[j]))) .+ exp_term
         .* erfc.((x[i] .+ v .* t[j]) ./ (2 .* sqrt.(Dl .* t[j]))))
        if t[j] >= t_pulse
            co = - ratio .* (erfc.((x[i] .- v .* (t[j] .- t_pulse))
            ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))) .+ exp_term
            .* erfc.((x[i] .+ v .* (t[j] .- t_pulse)) ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))))
            cr[j, i] = cr[j, i] .+ co
        end
    end
    return nothing
end

This is my loss function:

function loss_julia(p::Vector)
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    De = 2.01e-9 + alpha_l * vp
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

I can run the loss function:

loss_julia([0.3, 1e-3])

and calculate the jacobian of the pulse_injection function:

ForwardDiff.jacobian((c_model, p)-> pulse_injection(c_model, x, t, c0, c_in, q/p[1], 2.01e-9 + p[2]*q/p[1], t_pulse), c_model, [0.3, 1e-3])

but not from the loss function:

ForwardDiff.jacobian(loss_julia, [0.3, 1e-3])

I get this error:

ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(loss_julia), Float64}, Float64, 2})```

Full code snippet:
```julia
using SpecialFunctions
using BenchmarkTools
using Plots
using Optim
using OptimizationOptimJL
using Optimization
using SciMLSensitivity
using CSV
using DataFrames
using ForwardDiff

function pulse_injection(
    cr::Matrix{T},
    x::Vector{T},
    t::Vector{T},
    c0,
    c_in,
    v,
    Dl,
    t_pulse
    ) where T
    #cr = zeros(length(t), length(x))
    ratio = (c0 - c_in) / 2
    @inbounds for i=1:length(x), j=1:length(t)
        exp_term = exp(v .* x[i] / Dl)
        cr[j, i] = c_in .+ ratio .* (erfc.((x[i] .- v .* t[j])
         ./ (2 .* sqrt.(Dl .* t[j]))) .+ exp_term
         .* erfc.((x[i] .+ v .* t[j]) ./ (2 .* sqrt.(Dl .* t[j]))))
        if t[j] >= t_pulse
            co = - ratio .* (erfc.((x[i] .- v .* (t[j] .- t_pulse))
            ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))) .+ exp_term
            .* erfc.((x[i] .+ v .* (t[j] .- t_pulse)) ./ (2 .* sqrt.(Dl .* (t[j] .- t_pulse)))))
            cr[j, i] = cr[j, i] .+ co
        end
    end
    return nothing
end

using CSV
using DataFrames
#load data
data = CSV.read("data/data_a.csv",DataFrame)
global t = data[:,1]
global c = data[:,2]
global c0 = 2.0
global c_in = 0.0

global t_pulse = 3600.0
global x = [0.121]
Q0_ml = 6 #ml/hr
Q0 = Q0_ml*1e-6/3600 # m3/s
diam = 0.037 # diameter of the column [m]
area = pi*(diam/2)^2 # cross-sectional area of the column [m2]
global q = Q0/area # specific discharge [m/s]

c_model = zeros(length(t), length(x))
function loss_julia(p::Vector{T}) where T
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    De = 2.01e-9 + alpha_l * vp
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

loss_julia([0.3, 1e-3])

ForwardDiff.jacobian((c_model, p)-> pulse_injection(c_model, x, t, c0, c_in, q/p[1], 2.01e-9 + p[2]*q/p[1], t_pulse), c_model, [0.3, 1e-3])
ForwardDiff.jacobian(loss_julia, [0.3, 1e-3])
global c = data[:,2]
c_model = zeros(length(t), length(x))
function loss_julia(p::Vector{T}) where T
    (...)
    pulse_injection(c_model, x, t, c0, c_in, vp, De, t_pulse)
    return sum((c_model[:, 1] .- c).^2)
end

This is not good:

  1. Your function depends on global variables, this should be avoided almost always
  2. It looks like your pulse_injection function mutates its input arguments, conventionally this is denoted in Julia by appending a bang to the function name (pulse_injection!)
  3. You fix the type of c_model to be Array{Float64} in your zeros call. You want zeros(eltype(z), length(t), length(x)) if you are using this array in autodiff code, as it will need to accomodate dual numbers:
julia> using ForwardDiff

julia> x = zeros(3)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> x[1] = ForwardDiff.Dual(1)
ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{Nothing, Int64, 0})
2 Likes

Thanks! That was fast. Sorry. I have more questions. In relation to your topics:

  1. I guess keeping them as normal variables should be no problem, right? My concern is that I want the parameters to be available to the loss_function without calling them explicitly. Would suggest a different approach.
  2. True. Thanks for the tips/
  3. You suggest zeros(eltype(z), length(t), length(x)). What would z be in this case, or in general?

I’m not sure what you mean by “normal variables” here. The key is that all variables in your function are actually arguments to that function. If you need a one-arg version of your function for optimization, the usual pattern is to create a closure like

loss(a, b, c) = a*b^c
b_val = 1; c_val = 2
optimize(z -> loss(z, b_val, c_val), ...)

z is the input to your loss function which you are optimizing over and which therefore will become a Dual during autodiff.

2 Likes

Thanks a lot @nilshg

I could make it work by creating the mutated variable inside the function with the parameters signature as you have mentioned above:

function loss_julia(p, x, t, c0, c_in, q, De, t_pulse)
    phi = p[1]
    alpha_l = p[2]
    vp = q / phi
    Dl = De + alpha_l * vp
    c_model = zeros(eltype(p),length(t), length(x))
    pulse_injection!(c_model, x, t, c0, c_in, vp, Dl, t_pulse)
    return sum((@view(c_model[:, 1]) - c).^2)
end
1 Like