Stuck differentiating through multidimensional, potentially large and stiff ODE solution

Hi,

I am solving 1D advective dispersive and reactive transport models as Ordinary Differential Equations. To speed things up, I am using Symbolics to detect the sparsity pattern or modelingtoolkitize to reduce allocations. The model construction look something like this:

function make_ode(v, De, c_in, nmob, dx)
    function ode!(du, u, p, t)
        r1, r2 = p
        c_advec = [c_in;u[:,1:nmob]]
        advec = -v .* diff(c_advec, dims=1) ./ dx
        gradc = diff(u[:,1:nmob], dims=1)./dx
        disp = ([gradc; zeros(1, nmob)]-[zeros(1, nmob); gradc]).* De ./ dx
        rate1 = r1 .* u[:,1]
        rate2 = r2 .* u[:,3]
        du .= advec .+ disp
        du[:,1] .-= rate1
        du[:,2] .+= rate1
        du[:,3] .-= rate2
    end
end

Where I potentially need to find the values of the parameters (r1,r2) through model fitting.

I have been using these different strategies to speed the models up:

  • Original problem
prob = ODEProblem(ode!, u0, tspan, p)
@benchmark sol = solve($prob, QNDF(), saveat=1800, abstol = 1e-8, reltol = 1e-8)
# 67.55 ms (55143 allocations: 60.83 MiB)
  • Symbolic sparsity detection
du0 = similar(u0)
jac_sparsity = Symbolics.jacobian_sparsity((du, u) -> ode!(du, u, p, 1),
    du0, u0) # add the sparsity pattern to speed up the solution
f! = ODEFunction(ode!, jac_prototype=jac_sparsity)
fastprob = ODEProblem(f!, u0, tspan, p)
@benchmark sol = solve($fastprob, QNDF(), saveat=1800, abstol = 1e-8, reltol = 1e-8)
# 12.89 ms (39418 allocations: 37.36 MiB)
  • Modeling Toolkit’s modelingtoolkitize:
de = complete(modelingtoolkitize(prob))
mtkprob = ODEProblem(de, [], tspan, jac=true, sparse=true)
# For some reason modelling toolkit reshape my output,
# but if I remake with u0 in the original shape it works correctly.
mtkprob = remake(mtkprob, u0 = u0)
@benchmark sol = solve($mtkprob, QNDF(), saveat=1800, abstol = 1e-8, reltol = 1e-8)
# 8.98 ms (2798 allocations: 11.62 MiB)

And then I use the data collected at the outflow to fit the model parameters:

function loss(u, p)
    prob, data, t = p
    loss_prob = remake(prob, p = u)
    sol = solve(loss_prob, QNDF(), saveat=t, abstol = 1e-8, reltol = 1e-8,
    sensealg = SensitivityADPassThrough())
    return sum([sum(abs2, (sol.u[i][end,:] - data[i, :])) for i in eachindex(t)])
end

I Have a few concerns when I try to calculate the derivative of the loss function, I was hoping to get some advice here.

  1. Calculating gradients with different solver configurations.

if I use my original ODEProblem:prob, ForwardDiff works:

loss(p, [prob, data, t])
closure(q) = p -> loss(p, q)
backend = AutoForwardDiff()
prep = prepare_gradient(closure([prob, data, t]), backend, p)
grad = gradient(closure([prob, data, t]), prep, backend, p)

But if use the version with the sparsity pattern (fastprob) or the mtk reconstructtion (mtkprob), then it doesn’t work anymore:

prep = prepare_gradient(closure([mtkprob, data, t]), backend, p)
grad = gradient(closure([mtkprob, data, t]), prep, backend, p)

And I get this error, even after trying different sensalg parameters.

ERROR: MethodError: no method matching KLU.KLUFactorization(::SparseArrays.SparseMatrixCSC{ForwardDiff.Dual{ForwardDiff.Tag{…}, Float64, 2}, Int64})
  1. The solution seems to be sensitive to numerical finite differences, if I dont set relstep I have incorrect results e.g.: backend = AutoFiniteDiff(fdtype =Val(:central); relstep=1e-10)

Anyway, I was wondering if you can reccomend me what are the up-to-date procedures for differentiating through these ode solutions. Should I measure the tradeoff between a faster solution through finite differences or slower differentiable solution?

Thanks!!

1 Like