ModelingToolkit: algebraic equations in likelihood and reverse-mode AD

Brian_Hartley · February 7, 2023, 1:01pm

TLDR - Is it possible to use both state / observed variables from ODESolution in a likelihood, without concatenating them ex post or using mutation?

Maybe there is something dead-simple I’m missing!

I’m starting to convert some models in continuous time into the Julia SciML ecosystem (which is fantastic). For my application it is often convenient to describe the system’s behavioral equations using both differential and algebraic equations, with estimated model parameters contained in both, plus some identities/constraints. So the system is a DAE (medium/large), but the likelihood relies on both “state” and “observed” variables in the ModelingToolkit.jl terminology.

This runs me into the standard mutation problem with Zygote. I am practicing tearing / simplifying the system, then concatonating these variables after the integration step to compute the likelihood, but this seems like a poor idea since this involves mutation and forces reliance only on ForwardDiff (undesirable for large P). I suppose I could take total time derivatives of all the observed algebraic equations and use only the expanded state variables in the likelihood (and reverse-mode AD). Is there a simple way to avoid this and to access both the state/observed variables from the ODESolution in a way compatible with reverse-mode AD?

My apologies if this simply relitigates the well-known issue of mutation with Zygote, that is not my intent (though I have tried to avoid this using Buffer unsuccessfully). I hoped that this was a use-case were an easier solution was available and that I am misunderstanding some basic package functionality regarding differentiability.

Here is a simple system illustrating the point:

using ModelingToolkit, DifferentialEquations
using Distributions, Optimization

# define symbolic system 
@parameters  α, ρ, β, θ
@variables t, u₁(t), u₂(t), u₃(t), z(t)
D = Differential(t)
eqs = [D(u₁) ~ α*(u₂-u₁),
    D(u₂) ~ u₁ * (ρ - u₃) - u₂, 
    D(u₃) ~ u₁ * u₂ - β*z,
    z ~ θ*u₃]  #   <-- also have data for this, absence from likelihood generates weak identification 
@named lorenz1 = ODESystem(eqs)

# weak identification : β vs. θ₁ 
simple_sys = structural_simplify(lorenz1)

p = [10, 28 , 1/6, 2]
tspan = (0.0,3.0)
tseq = collect(tspan[1]:0.1:tspan[2])
u0 = [1.0,0.0,0.0]
prob = ODEProblem(simple_sys,u0,tspan,p;)
sol = solve(prob,Tsit5(),saveat=collect(0.0:0.1:3))

# Generate data including z 
# concatonate obs + state 
obs_symb = states(lorenz1)
traj = sol[obs_symb]
N, T = (length(obs_symb),length(t_seq))
data_cat = [traj[t][i] + 2.0 * randn() for i in 1:N, t in 1:T]
p0 = vcat(prob.p,fill(2.2,N)) # init param 

function loglik(p)

    # re index param vector 
    θ = eltype(p).(p[1:length(prob.p)])
    sigma = eltype(p).(p[(length(prob.p)+1):(N+length(prob.p))])

    # solve simp/torn problem 
    tmp_prob = remake(prob,u0=eltype(θ).(prob.u0),p=θ)
    tmp_sol = solve(tmp_prob, Tsit5(), saveat=0.1)

    # likelihood 
    loglik =Vector{Real}(undef,T)
    for t ∈ 1:T # 
        loglik[t] = logpdf(MvNormal(sol[obs_symb][:][t],sigma), data_cat[:,t]) # allocating; inefficient 
    end

    return(-sum(loglik)) # negative
end 

# test AD 
using ForwardDiff, Zygote
using SciMLSensitivity

loglik(p0)
fd_grad = ForwardDiff.gradient(loglik,p0) # works fine 
fd_grad = Zygote.gradient(loglik,p0) # mutation!

ChrisRackauckas · February 7, 2023, 7:36pm

Not right now, it’s an open issue that will need a derivative definition

github.com/SciML/SciMLSensitivity.jl

Adjoints of observed variables

opened 07:39PM - 27 Dec 22 UTC

iliailmer

I was trying to adapt [this example from tutorials](https://docs.sciml.ai/SciMLS…ensitivity/dev/ode_fitting/optimization_ode/) so that I can use a more customized data sample. For example, if I have states `x1, x2, x3`, I collect the following sample: ```julia mq = [y1~x1, y2~x2+x3] ``` then I am trying to use loss as ```julia function loss(p) sol = solve(prob, Tsit5(), p = p, saveat = tsteps) data_true = [data_sample[v.rhs] for v in mq] data = [sol[v.rhs] for v in mq] # this is where the code fails loss = sum(sum((data[i] .- data_true[i]) .^ 2 for i in eachindex(data))) return loss, sol end ``` but I am getting ` ArgumentError: invalid index: x1(t) of type Term{Real, Base.ImmutableDict{DataType, Any}}` at the line ```julia data = [sol[v.rhs] for v in mq] ``` What can be done to fix this?

Brian_Hartley · February 7, 2023, 8:30pm

Ok duly noted, thanks for the reply.

Topic		Replies	Views
ModelingToolkit/ ODESystem in Turing Probabilistic programming turing , modelingtoolkit	15	366	June 18, 2024
Generating derived variables from ModelingToolkit.jl - variable of interest eliminated during structural_simplify Modelling & Simulations	8	646	February 2, 2023
Using Modeling Toolkit to solve simple equation Modelling & Simulations modelingtoolkit	2	290	May 15, 2024
ModelingToolkit.jl -- intermediate symbolic variables calculated from solutions output in the state vector General Usage modelingtoolkit	5	982	May 6, 2024
[ANN] DataDrivenDiffEq.jl - Overhauled in 0.6.0 Package Announcements symbolic , sciml , modelingtoolkit	0	607	June 1, 2021

ModelingToolkit: algebraic equations in likelihood and reverse-mode AD

Related topics