ModelingToolkit/ ODESystem in Turing

Hi all,

Just been playing around with DifferentialEquations.jl and ModelingToolkit.jl within Turing and spotted the behaviour below.

Setting up the system and generating synthetic data:

using ModelingToolkit, DifferentialEquations

@parameters t
@variables y1(t) y2(t)
@parameters p1 p2

D = Differential(t)

# Define the system of equations
eqs = [
    D(y1) ~ y2,
    D(y2) ~ -p1 * y2 - p2 * y1
]

# Create an ODESystem
@named sys = ODESystem(eqs, t, [y1, y2], [p1, p2])

# Initial conditions
u0 = [y1 => 1.0, y2 => 0.0]

# Parameter values
p = [p1 => 1.0, p2 => 2.0]

# Time span for the solution
tspan = (0.0, 10.0)

# Convert to standard ODEProblem
sys_simplified = structural_simplify(sys)
prob = ODEProblem(sys_simplified, u0, tspan, p)

# Solve the problem
sol_orig = solve(prob)

# Synthetic data generation (for demonstration)
sim_data = sol_orig[1,:] + 0.1 * randn(length(sol_orig.t))

I then integrate this into Turing:

using Turing

# Define the model for Turing
@model function fit_model(data, prob)
    # Priors for the parameters
    p1 ~ Normal(1.0, 0.5)
    p2 ~ Normal(2.0, 0.5)

    # Solve the ODE system with the current parameter values
    prob = remake(prob, p=[p1, p2]) # p = [p2, p1] ?
    predicted = solve(prob, saveat=0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model = fit_model(sim_data, prob)
chain_prep = sample(model, NUTS(), MCMCThreads(), 5, 2)
chain = sample(model, NUTS(), MCMCThreads(), 2000, 2)

This gives p1 and p2 as 4.04 and 2.018 respectively, and leads to the following solution, which is clearly different from the original:

# Initial conditions
u0 = [y1 => 1.0, y2 => 0.0]

# Parameter values
p_ch1 = [p1 => mean(chain[:p1]), p2 => mean(chain[:p2])]

p = p_ch1

# Time span for the solution
tspan = (0.0, 10.0)

# Convert to standard ODEProblem
sys_simplified = structural_simplify(sys)
prob = ODEProblem(sys_simplified, u0, tspan, p)

# Solve the problem
sol = solve(prob)

Plots.plot(sol[1, :], title = "chain_1", xlims = (0, 25))
Plots.plot!(sol[2, :])
Plots.plot!(sol_orig[1, :], label = "orig y1")
Plots.plot!(sol_orig[2, :], label = "orig y2")

However, if I swap p = [p1, p2] with p = [p2, p1] in the Turing code block above, then the solution is much closer to what it should have been.

Am I missing something here? Why would I need to swap the positions of p1 and p2? I’m using ModelingToolkit v9.15.0

Don’t assume an ordering on the parameters. Use the symbolic tooling as described in the docs. This page goes into some detail:

Effectively you want to:

prob = remake(prob, p=[p1 => p1, p2 => p2])

Then to optimize it more, use setu as described in the docs.
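For concreteness, here is a minimal sketch of the ordering-safe version (assuming the symbolic p1/p2 from the earlier @parameters block are in scope; the sampled values get different names so they don't shadow the symbols):

```julia
using Turing

# Assumes the symbolic p1/p2 from the @parameters block are in scope.
@model function fit_model_sym(data, prob)
    # Distinct names for the sampled values, so the symbols aren't shadowed
    a ~ Normal(1.0, 0.5)
    b ~ Normal(2.0, 0.5)

    # Pair symbol => value, so no positional ordering is assumed
    new_prob = remake(prob, p = [p1 => a, p2 => b])
    predicted = solve(new_prob, saveat = 0.1)

    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end
```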

Thanks Chris. When you said to use setu to optimise, I'm not quite sure what you mean. The link only shows a short paragraph on setu/setp, and the only other reference on the same site gives the example copied below.

Can you elaborate a bit more on this setu syntax within the Turing use case or point me to some examples/sites please?

using SymbolicIndexingInterface

prob = ODEProblem(sys, [], [p1 => p[1], p2 => p[2]])
setter! = setp(sys, [p1, p2])
function loss(p)
    setter!(prob, p)
    sol = solve(prob, Tsit5())
    sum(abs2, sol)
end

I tried modifying my original code above to this:

using Turing

setter_p! = setp(sys_simplified, [p1, p2]) # ADDED

# Define the model for Turing
@model function fit_model_opt(data, ode_model)

    p_model = ode_model

    # initial conditions
    response_1 ~ Normal(1.0, 0.5)
    response_2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    param_1 ~ Normal(1.0, 0.5)
    param_2 ~ Normal(2.0, 0.5)
    
    u0_est = [response_1, response_2]
    param_est = [param_1, param_2]

    setter_p!(p_model, p = param_est) # ADDED

    predicted = solve(p_model, saveat=0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model_opt = fit_model_opt(sim_data, prob)
chain_prep = sample(model_opt, NUTS(), MCMCThreads(), 5, 2)

But then I got the error TaskFailedException:

nested task error: MethodError: no method matching (::SymbolicIndexingInterface.ParameterHookWrapper{SymbolicIndexingInterface.MultipleSetters{Vector{SymbolicIndexingInterface.SetParameterIndex{ModelingToolkit.ParameterIndex{SciMLStructures.Tunable, Tuple{Int64, Int64}}}}}, Vector{Num}})(::ODEProblem{Vector{Float64}, Tuple{Float64, Float64}, true, ModelingToolkit.MTKParameters{Tuple{Vector{Float64}}, Tuple{}, Tuple{}, Tuple{}, Tuple{}, Nothing, Nothing}, ODEFunction{true, SciMLBase.AutoSpecialize, ModelingToolkit.var"#f#711"{RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:ˍ₋arg1, :ˍ₋arg2, :t), ModelingToolkit.var"#_RGF_ModTag", ModelingToolkit.var"#_RGF_ModTag", (0x286d8247, 0xf8d5e03e, 0x1a7ae01c, 0x25d25e8f, 0xef93d4ad), Nothing}, RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:ˍ₋out, :ˍ₋arg1, :ˍ₋arg2, :t), ModelingToolkit.var"#_RGF_ModTag", ModelingToolkit.var"#_RGF_ModTag", (0x35e05d12, 0x3ee3d309, 0x67674753, 0x9b880f2d, 0x3a080b39), Nothing}}, UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, ModelingToolkit.var"#863#generated_observed#720"{Bool, ODESystem, Dict{Any, Any}, Vector{Any}}, Nothing, ODESystem, Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}, SciMLBase.StandardODEProblem}; p::Vector{Float64})

Incidentally, I also rewrote the ODE and wrapped it in a function instead. The speed seems comparable, but @time shows it uses fewer allocations and less RAM. Any thoughts on the pros/cons of this approach instead?

function de_func_2!(du, u, k, t)
    y1, y2 = u
    p1, p2 = k
  
    # Define the ODE
    du[1] = y2
    du[2] = -p1 * y2 - p2 * y1
end

u0 = [1.0, 0.0]
p = [1.0, 2.0]
tspan = (0.0, 10.0)

prob_f2 = ODEProblem(de_func_2!, u0, tspan, p)
@model function fit_model_3(data)

    # initial conditions
    y1 ~ Normal(1.0, 0.5)
    y2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    p1 ~ Normal(1.0, 0.5)
    p2 ~ Normal(2.0, 0.5)

    u = [y1, y2]
    k = [p1, p2]

    # Solve the ODE system with the current parameter values
    prob_func = remake(prob_f2, u0 = u, p = k)
    predicted = solve(prob_func, saveat=0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

model_3 = fit_model_3(sim_data)
chain_prep = sample(model_3, NUTS(), MCMCThreads(), 2000, 2)
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time)

…compared to the ModelingToolkit approach with @variables etc.:

# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time)

@cryptic.ax is this another case that needs the parameter buffer thing? How’s that coming along?

This should be setter_p!(p_model, param_est).

Is your ModelingToolkit time using remake? Because the setp method should help with the allocations.

I tried that setter_p!(p_model, param_est) but it gives me the same error.

As a test, if I replace that with setter_p!(p_model, [1.0, 2.0]) then it runs. So it would appear this is down to the Type of the param?

And yes - that 7.099 seconds run was with the remake, as I couldn't get the setter_p! syntax to work.

Is eltype(param_est) not Float64? That makes it a little more difficult. Can you create an ODEProblem where the parameter values have the type eltype(param_est)? If not, I would recommend looking into PreallocationTools.jl and using their GeneralLazyBufferCache to cache the problem based on eltype(param_est). I did reply to another discourse post using this same methodology, so I'll see if I can find that.

This is that discourse post. The method at the end is what you should use. Feel free to ask if anything is not clear.

EDIT: They’re using Turing.jl too, so it should map 1-1 with your use case.

It holds Float64 values underneath, but when I call display(eltype(param_est)) it reports ForwardDiff.Dual{ForwardDiff.Tag{Turing.TuringTag, Float64}, Float64, 4}.

Right, so it also uses ForwardDiff for AD. setp is only applicable when the types of parameters don’t change. Since the types change here, you need to use remake. To avoid repeatedly calling remake, PreallocationTools can cache the different ODEProblems that you get from remake depending on the type of p.
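A stripped-down sketch of that caching pattern (just PreallocationTools, no MTK): GeneralLazyBufferCache wraps a builder function and memoizes one result per input type, so the Float64 warmup calls and the Dual-valued AD calls each reuse their own buffer:

```julia
using PreallocationTools

# The builder runs once per eltype of p; the result is cached and reused.
lbc = GeneralLazyBufferCache(p -> similar(p))

buf  = lbc[[1.0, 2.0]]  # first Float64 access builds a Vector{Float64}
buf2 = lbc[[3.0, 4.0]]  # later Float64 accesses reuse the cached buffer
```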

I’ve followed that thread and modified as follows - can you let me know if that’s the most efficient way please?

syms = [sys_simplified.p1, sys_simplified.p2]

# cache for remake(prob, p=newp)
lbc = GeneralLazyBufferCache(
    function (p)
        remake_buffer(sys_simplified, prob.p, Dict(zip(syms, p)))
    end
)

setter_p! = setp(sys_simplified, syms) 

Then in Turing:

using Turing

# Define the model for Turing
@model function fit_model_opt(data, ode_model, lbc)

    p_model = ode_model

    # initial conditions
    response_1 ~ Normal(1.0, 0.5)
    response_2 ~ Normal(0.0, 0.5)

    # Priors for the parameters
    param_1 ~ Normal(1.0, 0.5)
    param_2 ~ Normal(2.0, 0.5)
    
    u0_est = [response_1, response_2]
    param_est = [param_1; param_2]
    new_p = lbc[param_est]

    setter_p!(new_p, param_est)
    new_prob = remake(p_model, p = new_p)

    predicted = solve(new_prob, saveat=0.1)

    # Likelihood function
    for i in 1:length(data)
        data[i] ~ Normal(predicted[i][1], 0.1)
    end
end

# Run the model
model_opt = fit_model_opt(sim_data, prob, lbc)
chain_prep = sample(model_opt, NUTS(), MCMCThreads(), 2000, 2)

In terms of runtime:

# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time) # original
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time) # with the function de_func_2!
# 4.270760 seconds (106.78 M allocations: 16.025 GiB, 28.01% gc time) # above

EDIT: forgot to include u0/ response_1/2 in that remake step when that 4.27 sec run was done. Once I include that back it’s back up to

# 6.954024 seconds (173.44 M allocations: 26.144 GiB, 31.45% gc time)

That’s perfect. It looks pretty fast too.

I’ll add a tutorial for this sort of workflow in MTK too.

Also, you should note that using setter_p inside the function without passing it (or it being const) will cause slowdowns, since Julia will not be able to infer the type of setter_p.

You mean like @model function fit_model_opt(data, ode_model, lbc, setter_p)

and call function like model_opt = fit_model_opt(sim_data, prob, lbc, setter_p)?

And is there an equivalent function/ approach for the u0_est parameter/initial conditions too? I replicated for u0:

syms_u = [sys_simplified.y1, sys_simplified.y2]

# cache for remake(prob, u0=newu)
lbc_u = GeneralLazyBufferCache(
    function (u)
        remake_buffer(sys_simplified, prob.u0, Dict(zip(syms_u, u)))
    end
)

setter_u = setu(sys_simplified, syms_u)

and added to the Turing bit:

u0_est = [response_1; response_2]
new_u = lbc_u[u0_est]
setter_u(new_u, u0_est)
setter_p!(new_p, param_est)
new_prob = remake(p_model, u0 = new_u, p = new_p)

The above ran, but now with the remake including the state_variables the gain in speed and allocation is marginal:

# 7.099144 seconds (189.25 M allocations: 27.969 GiB, 28.48% gc time) # original
# 5.249200 seconds (171.07 M allocations: 17.648 GiB, 25.26% gc time) # using function de_func_2!
# 6.279233 seconds (170.90 M allocations: 25.862 GiB, 29.44% gc time) # above

Is there any reason I should use the above over the function de_func_2! and remake approach?

Yeah, that looks good.

u will always be a fixed-size array. For such cases, DiffCache is the better option, since it's type-inferred and designed specifically for this purpose. See this example.
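A minimal sketch of that DiffCache pattern: get_tmp hands back a plain Float64 buffer for Float64 input, and a Dual-capable buffer during ForwardDiff passes, all backed by one preallocated array:

```julia
using PreallocationTools, ForwardDiff

dc = DiffCache(zeros(2))        # backed by a fixed-size Float64 array
u  = get_tmp(dc, [1.0, 0.0])    # Float64 input -> Float64 buffer
d  = ForwardDiff.Dual(1.0, 1.0)
ud = get_tmp(dc, [d, d])        # Dual input -> Dual-capable buffer
ud[1] = d                       # safe to write Duals during AD
```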

So I added dc_u = DiffCache(prob.u0) after the setter_p! = setp(sys_simplified, syms) line

and then in the Turing block, modified to show:

u0_est = [response_1; response_2]
new_u = get_tmp(dc_u, u0_est)

param_est = [param_1; param_2]
new_p = lbc_p[param_est]

new_prob = remake(p_model, u0 = new_u, p = new_p)

That ran super quick:

# 3.048528 seconds (87.84 M allocations: 13.391 GiB, 32.24% gc time) # chain_opt + dc_u

BUT - somehow on any subsequent run of the Turing sampling

@time chain_opt = sample(model_opt, NUTS(), MCMCThreads(), 2000, 2)

it always seems to take A LOT longer despite reaching 50% after 3 seconds, with a total of e.g.

360.726003 seconds (7.62 G allocations: 776.021 GiB, 15.21% gc time)

This is despite me restarting the laptop/ VSCode etc. If I take out the DiffCache code, then the runs become consistent again.

Do I need to manually clear out any cached variables after each run or something similar…? Most odd.

On the first run I did get following warning though:

Warning: The supplied DiffCache was too small and was enlarged. This incurs allocations
│     on the first call to `get_tmp`. If few calls to `get_tmp` occur and optimal performance is essential,
│     consider changing 'N'/chunk size of this DiffCache to 4.

EDIT: Crammed in

dc_u = nothing
dc_u = DiffCache(prob.u0, 4)

before each Turing sampling step seems to have gotten rid of that slow-down issue.

# 6.279233 seconds (170.90 M allocations: 25.862 GiB, 29.44% gc time) # with  LazyBufferCache on the p only
# 3.048528 seconds (87.84 M allocations: 13.391 GiB, 32.24% gc time) # with the DiffCache on the u + LazyBufferCache on the p 