# ModelingToolkit/ ODESystem in Turing

Hi all,

Just been playing around with DifferentialEquations.jl and ModelingToolkit within Turing and spotted the behaviour below.

Setting up the system and generating synthetic data:

``````julia
using ModelingToolkit, DifferentialEquations

@parameters t
@variables y1(t) y2(t)
@parameters p1 p2

D = Differential(t)

# Define the system of equations
eqs = [
    D(y1) ~ y2,
    D(y2) ~ -p1 * y2 - p2 * y1
]

# Create an ODESystem
@named sys = ODESystem(eqs, t, [y1, y2], [p1, p2])

# Initial conditions
u0 = [y1 => 1.0, y2 => 0.0]

# Parameter values
p = [p1 => 1.0, p2 => 2.0]

# Time span for the solution
tspan = (0.0, 10.0)

# Convert to standard ODEProblem
sys_simplified = structural_simplify(sys)
prob = ODEProblem(sys_simplified, u0, tspan, p)

# Solve the problem
sol_orig = solve(prob)

# Synthetic data generation (for demonstration)
sim_data = sol_orig[1,:] + 0.1 * randn(length(sol_orig.t))
``````

I then integrate this into Turing:

``````julia
using Turing

# Define the model for Turing
@model function fit_model(data, prob)
    # Priors for the parameters
    p1 ~ Normal(1.0, 0.5)
    p2 ~ Normal(2.0, 0.5)

    # Solve the ODE system with the current parameter values
    prob = remake(prob, p = [p1, p2]) # p = [p2, p1] ?
    predicted = solve(prob, saveat = 0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model = fit_model(sim_data, prob)
chain_prep = sample(model, NUTS(), MCMCThreads(), 5, 2)
chain = sample(model, NUTS(), MCMCThreads(), 2000, 2)
``````

This gives `p1` and `p2` as `4.04` and `2.018` respectively, and leads to the following solution, which is clearly different from the original:

``````julia

# Initial conditions
u0 = [y1 => 1.0, y2 => 0.0]

# Parameter values
p_ch1 = [p1 => mean(chain[:p1]), p2 => mean(chain[:p2])]

p = p_ch1

# Time span for the solution
tspan = (0.0, 10.0)

# Convert to standard ODEProblem
sys_simplified = structural_simplify(sys)
prob = ODEProblem(sys_simplified, u0, tspan, p)

# Solve the problem
sol = solve(prob)

Plots.plot(sol[1, :], title = "chain_1", xlims = (0, 25))
Plots.plot!(sol[2, :])
Plots.plot!(sol_orig[1, :], label = "orig y1")
Plots.plot!(sol_orig[2, :], label = "orig y2")
``````

However, if I swap `p = [p1, p2]` with `p = [p2, p1]` in the Turing code block above, then the solution is much closer to what it should have been.

Am I missing something here? Why would I need to swap the positions of `p1` and `p2`? I'm using `ModelingToolkit v9.15.0`.

Don't assume an ordering on the parameters. Use the symbolic tooling as described in the docs. This page goes into some detail:

Effectively you want to:

``````julia
prob = remake(prob, p = [p1 => p1, p2 => p2])
``````
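To see why a positional `p` can bite, note that the ordering the simplified system stores internally need not match the `[p1, p2]` order passed to `ODESystem`. A quick sketch to check it (assuming the `sys_simplified` defined earlier; `parameters` is ModelingToolkit's accessor):

```julia
using ModelingToolkit

# Ask the simplified system for its internal parameter ordering; if this
# prints [p2, p1], then a plain vector passed as `p` is interpreted that way.
parameters(sys_simplified)
```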

Then to optimize it more, use `setu` as described in the docs.

Thanks Chris. When you said to use `setu` to optimise, I'm not quite sure what you mean. The link only shows a short paragraph on `setu`/`setp`, and the only other reference on the same site gives the example copied below.

Can you elaborate a bit more on this `setu` syntax within the `Turing` use case or point me to some examples/sites please?

``````julia
using SymbolicIndexingInterface

prob = ODEProblem(sys, [], [p1 => p[1], p2 => p[2]])
setter! = setp(sys, [p1, p2])
function loss(p)
    setter!(prob, p)
    sol = solve(prob, Tsit5())
    sum(abs2, sol)
end
``````

I tried modifying my original code above to this:

``````julia
using Turing

setter_p! = setp(sys_simplified, [p1, p2]) # ADDED

# Define the model for Turing
@model function fit_model_opt(data, ode_model)

    p_model = ode_model

    # Initial conditions
    response_1 ~ Normal(1.0, 0.5)
    response_2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    param_1 ~ Normal(1.0, 0.5)
    param_2 ~ Normal(2.0, 0.5)

    u0_est = [response_1, response_2]
    param_est = [param_1, param_2]

    setter_p!(p_model, p = param_est) # ADDED

    predicted = solve(p_model, saveat = 0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model_opt = fit_model_opt(sim_data, prob)
chain_prep = sample(model_opt, NUTS(), MCMCThreads(), 5, 2)
``````

But then I got the error `TaskFailedException`:

``````text
nested task error: MethodError: no method matching (::SymbolicIndexingInterface.ParameterHookWrapper{SymbolicIndexingInterface.MultipleSetters{Vector{SymbolicIndexingInterface.SetParameterIndex{ModelingToolkit.ParameterIndex{SciMLStructures.Tunable, Tuple{Int64, Int64}}}}}, Vector{Num}})(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, ModelingToolkit.MTKParameters{Tuple{Vector{Float64}}, Tuple{}, Tuple{}, Tuple{}, Tuple{}, Nothing, Nothing}, ODEFunction{true, SciMLBase.AutoSpecialize, ModelingToolkit.var"#f#711"{RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:ˍ₋arg1, :ˍ₋arg2, :t), ModelingToolkit.var"#_RGF_ModTag", ModelingToolkit.var"#_RGF_ModTag", (0x286d8247, 0xf8d5e03e, 0x1a7ae01c, 0x25d25e8f, 0xef93d4ad), Nothing}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:ˍ₋out, :ˍ₋arg1, :ˍ₋arg2, :t), ModelingToolkit.var"#_RGF_ModTag", ModelingToolkit.var"#_RGF_ModTag", (0x35e05d12, 0x3ee3d309, 0x67674753, 0x9b880f2d, 0x3a080b39), Nothing}}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, ModelingToolkit.var"#863#generated_observed#720"{Bool, ODESystem, Dict{Any, Any}, Vector{Any}}, Nothing, ODESystem, Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}, SciMLBase.StandardODEProblem}; p::Vector{Float64})
``````

Incidentally, I also rewrote the ODE and wrapped it in a plain function instead. The speed seems comparable, but `@time` shows it uses fewer allocations and less RAM. Any thoughts on the pros/cons of this approach?

``````julia
function de_func_2!(du, u, k, t)
    y1, y2 = u
    p1, p2 = k

    # Define the ODE
    du[1] = y2
    du[2] = -p1 * y2 - p2 * y1
end

u0 = [1.0, 0.0]
p = [1.0, 2.0]
tspan = (0.0, 10.0)

prob_f2 = ODEProblem(de_func_2!, u0, tspan, p)
``````
``````julia
@model function fit_model_3(data)

    # Initial conditions
    y1 ~ Normal(1.0, 0.5)
    y2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    p1 ~ Normal(1.0, 0.5)
    p2 ~ Normal(2.0, 0.5)

    u = [y1, y2]
    k = [p1, p2]

    # Solve the ODE system with the current parameter values
    prob_func = remake(prob_f2, u0 = u, p = k)
    predicted = solve(prob_func, saveat = 0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

model_3 = fit_model_3(sim_data)
chain_prep = sample(model_3, NUTS(), MCMCThreads(), 2000, 2)
``````
``````text
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time)
``````

…compared to the ModelingToolkit approach with `@variables` etc.:

``````text
# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time)
``````

@cryptic.ax is this another case that needs the parameter-buffer thing? How's that coming along?

This should be `setter_p!(p_model, param_est)`.

Is your ModelingToolkit time using `remake`? Because the `setp` method should help with the allocations.

I tried that `setter_p!(p_model, param_est)` but it gives me the same error.

As a test, if I replace that with `setter_p!(p_model, [1.0, 2.0])` then it runs. So it would appear this is down to the type of the parameters?

And yes, that `7.099 seconds` run was with the `remake`, as I couldn't get the `setter_p!` syntax to work.

Is `eltype(param_est)` not `Float64`? That makes it a little more difficult. Can you create an `ODEProblem` where the parameter values have the type `eltype(param_est)`? If not, I would recommend looking into PreallocationTools.jl and using their `GeneralLazyBufferCache` to cache the problem based on `eltype(param_est)`. I replied to another Discourse post using this same methodology, so I'll see if I can find that.

This is that discourse post. The method at the end is what you should use. Feel free to ask if anything is not clear.

EDIT: They're using Turing.jl too, so it should map 1-1 with your use case.

It is `Float64` on a plain call, but when I add `display(eltype(param_est))` it also shows `ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 4}`.

Right, so it also uses ForwardDiff for AD. `setp` is only applicable when the types of the parameters don't change. Since the types change here, you need to use `remake`. To avoid repeatedly calling `remake`, PreallocationTools can cache the different `ODEProblem`s that you get from `remake` depending on the type of `p`.
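A minimal pure-Julia illustration of why the types matter here (using a stand-in type, not the real `ForwardDiff.Dual`): a `setp`-style setter mutates an existing concretely typed `Float64` buffer in place, and such a buffer simply cannot hold a value of another number type.

```julia
# Stand-in for ForwardDiff.Dual; the only point is that it is not a Float64.
struct FakeDual <: Real
    value::Float64
    partial::Float64
end

p = [1.0, 2.0]  # concretely typed Float64 buffer, like the one inside the ODEProblem

wrote = try
    p[1] = FakeDual(1.0, 1.0)  # the in-place write a setter would attempt
    true
catch
    false  # no convert(Float64, ::FakeDual) exists, so the write throws
end
# wrote == false, which is why a freshly-typed problem from `remake` is needed
```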

I've followed that thread and modified as follows. Can you let me know if that's the most efficient way, please?

``````julia
using PreallocationTools, SymbolicIndexingInterface

syms = [sys_simplified.p1, sys_simplified.p2]

# Cache for remake(prob, p = newp)
lbc = GeneralLazyBufferCache(
    function (p)
        remake_buffer(sys_simplified, prob.p, Dict(zip(syms, p)))
    end)

setter_p = setp(sys_simplified, syms)
``````

Then in Turing:

``````julia
using Turing

# Define the model for Turing
@model function fit_model_opt(data, ode_model, lbc)

    p_model = ode_model

    # Initial conditions
    response_1 ~ Normal(1.0, 0.5)
    response_2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    param_1 ~ Normal(1.0, 0.5)
    param_2 ~ Normal(2.0, 0.5)

    u0_est = [response_1, response_2]
    param_est = [param_1; param_2]
    new_p = lbc[param_est]

    setter_p(new_p, param_est)
    new_prob = remake(p_model, p = new_p)

    predicted = solve(new_prob, saveat = 0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model_opt = fit_model_opt(sim_data, prob, lbc)
chain_prep = sample(model_opt, NUTS(), MCMCThreads(), 2000, 2)
``````

In terms of runtime:

``````text
# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time) # original
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time) # with the function de_func_2!
# 4.270760 seconds (106.78 M allocations: 16.025 GiB, 28.01% gc time) # above
``````

EDIT: I forgot to include `u0`/`response_1/2` in that remake step when that `4.27 sec` run was done. Once I include that back, it's back up to

``````text
# 6.954024 seconds (173.44 M allocations: 26.144 GiB, 31.45% gc time)
``````

That's perfect. It looks pretty fast too.

I'll add a tutorial for this sort of workflow in MTK too.

Also, note that using `setter_p` inside the function without passing it in (or it being `const`) will cause slowdowns, since Julia will not be able to infer the type of `setter_p`.
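The same point in miniature, with hypothetical toy names: a function that reaches for a non-`const` global cannot be type-inferred, whereas passing the setter as an argument (or marking it `const`) can be specialized.

```julia
setter = x -> 2x          # non-const global: its type can change, so calls are dynamic
f_slow(v) = setter(v)     # @code_warntype f_slow(1.0) reports an `Any` return type

const csetter = x -> 2x   # const global: type is fixed, the call is inferable
f_const(v) = csetter(v)

f_arg(v, s) = s(v)        # or simply pass the setter in as an argument
```

All three compute the same thing; only the first pays dynamic-dispatch overhead on every call.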

You mean like `@model function fit_model_opt(data, ode_model, lbc, setter_p)`

and call function like `model_opt = fit_model_opt(sim_data, prob, lbc, setter_p)`?

And is there an equivalent function/approach for the `u0_est` states/initial conditions too? I replicated it for `u0`:

``````julia
syms_u = [sys_simplified.y1, sys_simplified.y2]

# Cache for remake(prob, u0 = newu)
lbc_u = GeneralLazyBufferCache(
    function (u)
        remake_buffer(sys_simplified, prob.u0, Dict(zip(syms_u, u)))
    end)

setter_u = setu(sys_simplified, syms_u)
``````

and added to the Turing bit:

``````julia
u0_est = [response_1; response_2]
new_u = lbc_u[u0_est]
setter_u(new_u, u0_est)
setter_p(new_p, param_est)
new_prob = remake(p_model, u0 = new_u, p = new_p)
``````

The above ran, but now that the remake includes the state variables, the gain in speed and allocations is marginal:

``````text
# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time) # original
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time) # using function de_func_2!
# 6.279233 seconds (170.90 M allocations: 25.862 GiB, 29.44% gc time) # above
``````

Is there any reason I should use the above over the `function de_func_2!` and `remake` approach?

Yeah, that looks good.

`u` will always be a fixed-size array. For such cases, `DiffCache` is the better option, since it's type-inferred and designed specifically for this purpose. See this example.
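For reference, a minimal sketch of the `DiffCache`/`get_tmp` pattern from PreallocationTools.jl (usage as in its docs; the chunk-size point comes up just below):

```julia
using PreallocationTools, ForwardDiff

dc = DiffCache(zeros(2))  # pre-allocates a Float64 buffer plus a Dual-sized one

u_plain = get_tmp(dc, [1.0, 2.0])  # Float64 input -> the Float64 buffer
d = ForwardDiff.Dual(1.0, 1.0)
u_dual = get_tmp(dc, [d, d])       # Dual input -> the reinterpreted Dual buffer
```

Because both buffers are created up front, `get_tmp` is type-stable, unlike a `GeneralLazyBufferCache` lookup.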

So I added `dc_u = DiffCache(prob.u0)` after the `setter_p = setp(sys_simplified, syms)` line,

and then in the Turing block, modified to show:

``````julia
u0_est = [response_1; response_2]
new_u = get_tmp(dc_u, u0_est)

param_est = [param_1; param_2]
new_p = lbc[param_est]

new_prob = remake(p_model, u0 = new_u, p = new_p)
``````

That ran super quick:

``````text
# 3.048528 seconds (87.84 M allocations: 13.391 GiB, 32.24% gc time) # chain_opt + dc_u
``````

BUT, somehow, on any subsequent run of the Turing sampling

``````julia
@time chain_opt = sample(model_opt, NUTS(), MCMCThreads(), 2000, 2)
``````

it always seems to take a lot longer: despite reaching 50% after about 3 seconds, the total comes to, e.g.

``````text
360.726003 seconds (7.62 G allocations: 776.021 GiB, 15.21% gc time)
``````

This is despite me restarting the laptop/ VSCode etc. If I take out the `DiffCache` code, then the runs become consistent again.

Do I need to manually clear out any cached variables after each run or something similar…? Most odd.

On the first run I did get the following warning, though:

``````text
Warning: The supplied DiffCache was too small and was enlarged. This incurs allocations
│     on the first call to `get_tmp`. If few calls to `get_tmp` occur and optimal performance is essential,
│     consider changing 'N'/chunk size of this DiffCache to 4.
``````

EDIT: Crammed in

``````julia
dc_u = nothing
dc_u = DiffCache(prob.u0, 4)
``````

before each Turing sampling step seems to have gotten rid of that slowdown issue.

``````text
# 6.279233 seconds (170.90 M allocations: 25.862 GiB, 29.44% gc time) # with LazyBufferCache on the p only
# 3.048528 seconds (87.84 M allocations: 13.391 GiB, 32.24% gc time) # with DiffCache on the u + LazyBufferCache on the p
``````