Adding uncertainty on the timesteps of a Bayesian ODE solve through `saveat`

I’d like to do a Bayesian inversion of parameters to an ODE model, and I would like to add some uncertainty on the input time series. Is there any way to do this? I’ve tried adding a parameter tau ~ Uniform(-60, 60) and using it in the saveat = times .+ tau part of the ODE specification, but this gets me into trouble.

Here’s a slightly big MWE based on the Lotka-Volterra example code. First I rewrote the example to use the new function(x) | (y = data) conditioning syntax to check that I understood it correctly. (I had to add an if !ismissing(x) block around everything that relied on x for the calculation, to allow me to sample from the prior.)

My first error was the ForwardDiff dual cache size issue I’ve run into a few times now, while the second was a dimension mismatch (might be the same underlying problem?).

I first posted this on the Turing Slack, but someone there referred me to the DifferentialEquations side of things.

using Turing
using DifferentialEquations
using FillArrays

# Load StatsPlots for visualizations and diagnostics.
using StatsPlots

using LinearAlgebra

# Set a seed for reproducibility.
using Random
Random.seed!(14);

# Define Lotka-Volterra model.
function lotka_volterra(du, u, p, t)
    # Model parameters.
    α, β, γ, δ = p
    # Current state.
    x, y = u

    # Evaluate differential equations.
    du[1] = (α - β * y) * x # prey
    du[2] = (δ * x - γ) * y # predator

    return nothing
end

# Define initial-value problem.
u0 = [1.0, 1.0]
p = [1.5, 1.0, 3.0, 1.0]
tspan = (0.0, -20.0) # slightly larger than extrema(times) because of tau
prob = ODEProblem(lotka_volterra, u0, tspan, p)

# Plot simulation.
plot(solve(prob, Tsit5()))

# in my use-case time goes backwards, wanted to ensure this isn't
# the cause of the problem
times = 0:(-0.1):(-10)
sol = solve(prob, Tsit5(); saveat=times)

odedata = Array(sol) + 0.8 * randn(size(Array(sol)))

# # Plot simulation and noisy observations.
# plot(sol; alpha=0.3)
# scatter!(sol.t, odedata'; color=[1 2], label="")

# used the second version, which only has predator data
@model function fitlv(data::AbstractVector, times, prob)
    # Prior distributions.
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)

    # Simulate Lotka-Volterra model.
    p = [α, β, γ, δ]

    # make sure tspan is large enough
    newprob = remake(prob; p = p, tspan = [0; minimum(times)])

    predicted = solve(newprob, Tsit5();
                      # seems like I cannot adjust tspan here?
                      # my particular solution needs t = 0 to be included
                      saveat = [0; times],
                      # simulate full system but save only second state
                      save_idxs = 2)

    # Observations
    data ~ MvNormal(predicted.u, σ^2 * I)

    return nothing
end

model = fitlv(odedata[2,:], times, prob)

# Sample 3 independent chains with forward-mode automatic differentiation (the default).
chain = sample(model, NUTS(), MCMCSerial(), 1000, 3; progress=false)

plot(chain)

plot(; legend=false)
posterior_samples = sample(chain[[:α, :β, :γ, :δ]], 300; replace=false)
for p in eachrow(Array(posterior_samples))
    sol_p = solve(prob, Tsit5(); p=p, saveat=times)
    plot!(sol_p; alpha=0.1, color="#BBBBBB")
end

# Plot simulation and noisy observations.
plot!(sol; color=[1 2], linewidth=1)
scatter!(sol.t, odedata'; color=[1 2])

# create the same model but using the new conditioning | data syntax
@model function fitlv2(times, prob)
    # Prior distributions.
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)

    # Simulate Lotka-Volterra model.
    p = [α, β, γ, δ]

    # I think this is needed to be able to sample from the prior
    if !ismissing(times) && !ismissing(prob)
        # make sure tspan is large enough
        newprob = remake(prob; p = p, tspan = [0; minimum(times)])

        predicted = solve(newprob, Tsit5();
                          # seems like I cannot adjust tspan here?
                          # my particular solution needs t = 0 to be included
                          saveat = [0; times],
                          # simulate full system but save only second state
                          save_idxs = 2)

        # Observations
        data ~ MvNormal(predicted.u, σ^2 * I)
    end

    return nothing
end

model2_prior = fitlv2(missing, prob)
prior2 = sample(model2_prior, NUTS(), MCMCSerial(), 1000, 3; progress=false)

model2 = fitlv2(times, prob) | (data = odedata[2,:], )
chain2 = sample(model2, NUTS(), MCMCSerial(), 1000, 3; progress=false)


# finally, our new addition to the model:
# what if there is some uncertainty on times?
# (in this case this matters because there is some external forcing function that depends on t)
@model function fitlvtau(times, prob)
    # Prior distributions.
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)
    τ ~ Uniform(0, 0.3) # tau is the uncertainty on the time of the sample

    # I also tried the below but I'm confused here...
    # τ ~ filldist(truncated(Normal(0, 0.3); lower = 0, upper = 1),
                 # length(data))

    if !ismissing(times)
        # set up our ages with uncertainty added
        # T = [0; times .+ τ] # first attempt
        # ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Ta

        # T = times .+ τ # what if it's the [0;] part?
        # ERROR: MethodError: no method matching zero(::Base.TwicePrecision{Float64})

        # then I tried setting it up like this:
        # for (i, t) in enumerate(times)
        #     T[i] ~ Normal(t, τ^2)
        # end
        # and then switched to MvN for speed...
        T ~ MvNormal(times, τ * I)
        # which fails with
        # ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{DynamicPPL.DynamicPPLTag, Float64}, Float64, 12})

        # Simulate Lotka-Volterra model.
        p = [α, β, γ, δ]

        # make sure tspan is large enough
        newprob = remake(prob; p = p, tspan = [0; minimum(times)])

        predicted = solve(newprob, Tsit5();
                          saveat = T,
                          save_idxs = 2)
        # real example would now interpolate some external value
        # to timesteps of sol.t and then do some additional calculations here

        # Observations.
        data ~ MvNormal(predicted.u, σ^2 * I)
    end
    return nothing
end

modtau_prior = fitlvtau(missing, prob)
chaintau_prior_miss = sample(modtau_prior, NUTS(), MCMCSerial(), 1000, 3)
# this works

modtau_prior = fitlvtau(times, prob)
chaintau_prior = sample(modtau_prior, NUTS(), MCMCSerial(), 1000, 3)

# fails with the dual cache problem
# ERROR: MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{DynamicPPL.DynamicPPLTag, Float64}, Float64, 12})

modtau = fitlvtau(times, prob) | (data = odedata[2,:], )
chaintau = sample(modtau, NUTS(), MCMCSerial(), 1000, 3)

# fails with this error:
# ERROR: DimensionMismatch: inconsistent array dimensions

# mess-around area
@model function fitlvtau(times, prob)
    # Prior distributions.
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)
    τ ~ Uniform(0, 0.3) # tau is the uncertainty on the time of the sample

    # I also tried the below but I'm confused here...
    # τ ~ filldist(truncated(Normal(0, 0.3); lower = 0, upper = 1),
                 # length(data))

    if !ismissing(times)
        # set up our ages with uncertainty added
        # first I tried simply times .+ τ but this also didn't work
        # then I tried setting it up like this:
        # for (i, t) in enumerate(times)
        #     T[i] ~ Normal(t, τ^2)
        # end
        # and then switched to MvN for speed...
        # T ~ MvNormal(times, τ * I)

        # Simulate Lotka-Volterra model.
        p = [α, β, γ, δ]

        # make sure tspan is large enough
        predicted = solve(prob, Tsit5();
                          saveat = T, # <- this is where the error points
                          save_idxs = 2)
        # real example would now interpolate some external value
        # to timesteps of sol.t and then do some additional calculations here

        # Observations.
        data ~ MvNormal(predicted.u, σ^2 * I)
    end
    return nothing
end

If I understand correctly, you are basically trying to fit a model that incorporates some type of measurement error in the time measurements? If that’s the case and you merely want to fit that parameter, here is something I was able to sample from:

using Turing
using DifferentialEquations
using Distributions
using Random
using LinearAlgebra

# Set a seed for reproducibility
Random.seed!(14)

# Define Lotka-Volterra model
function lotka_volterra(du, u, p, t)
    α, β, γ, δ = p
    x, y = u
    du[1] = (α - β * y) * x # prey
    du[2] = (δ * x - γ) * y # predator
end

# Define initial-value problem
u0 = [1.0, 1.0]
p_true = [1.5, 1.0, 3.0, 1.0]
tspan = (0.0, 10.0)
prob = ODEProblem(lotka_volterra, u0, tspan, p_true)

# Generate synthetic data
times = 0.0:0.1:10.0
sol = solve(prob, Tsit5(); saveat=times)
odedata = Array(sol) + 0.1 * randn(size(Array(sol)))

@model function fitlv_time_uncertainty(times, prob, data)
    # Prior distributions
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5), 0.5, 2.5)
    β ~ truncated(Normal(1.2, 0.5), 0, 2)
    γ ~ truncated(Normal(3.0, 0.5), 1, 4)
    δ ~ truncated(Normal(1.0, 0.5), 0, 2)
    τ ~ Uniform(0, 0.3)

    # Time uncertainty
    T = times .+ rand(Normal(0, τ), length(times))

    # Simulate Lotka-Volterra model
    p = [α, β, γ, δ]
    predicted = solve(remake(prob, p=p, tspan=(0.0, maximum(T))), Tsit5(); saveat=T)

    # Observations
    for i in 1:length(times)
        data[:, i] ~ MvNormal(predicted(T[i]), σ^2 * I)
    end
end

# Fit the model
model = fitlv_time_uncertainty(times, prob, odedata)
chain = sample(model, NUTS(), 1000)

I essentially just assumed you were sampling time discrepancies from a zero-mean Normal with some standard deviation parameter tau. Hope this is useful.


Oh that’s great! This works for me too. It does make it way slower (217.04 seconds on my laptop), but that makes sense, because it needs to solve the ODE more often to figure out which solution best matches.

Could you explain why you add the rand(Normal(0, τ), N) call here? To sort of “instantiate” the distribution? Is it because it’s not possible to add a Distribution to a vector element-wise?

I thought this meant a simpler T = times .+ rand(τ, length(times)) would also work, but it didn’t. But making it a Normal with a different τ is fine too.

How do I now get access to the T’s for each sample for later plotting purposes? I’ve tried making the function return T, but this didn’t include it in the chain. I’ll have a look at the generated_quantities functionality that’s recently been added to figure this out tomorrow.

EDIT: by adding return (T, ) and then calling generated_quantities(model, chain) I got access to the T’s! This returned them as a 1000×1 Matrix{Tuple{Vector{Float64}}}, so I’ll have to dig into getting them out somehow later.
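One way to unpack them seems to be something like this (a sketch, assuming the model and chain from above; stack needs Julia 1.9 or the Compat package):

gq = generated_quantities(model, chain)
Ts = [first(t) for t in vec(gq)]   # one T vector per posterior draw
Tmat = stack(Ts)                   # length(times) × n_draws Matrix{Float64}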

I also didn’t realize I needed to add the Distributions and LinearAlgebra packages. Are the Distributions not exported by Turing? How do you figure out which dependencies you need when using a wrapper package?

Anyhow, thanks so much for having a look!

T = times .+ rand(τ, length(times))

This doesn’t work because rand expects a distribution; here τ is just a scalar. I think there is some confusion about this part, as the line that I wrote:

rand(Normal(0, τ), length(times))

is just adding a vector of samples from a Normal distribution with mean 0 and standard deviation τ.
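A quick REPL illustration of the difference:

julia> using Distributions

julia> τ = 0.3;

julia> rand(Normal(0, τ), 3);   # works: three draws from Normal(0, τ)

julia> rand(τ, 3)               # errors: a plain Float64 is not something rand can sample from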

I also didn’t realize I needed to add the Distributions and LinearAlgebra packages. Are the Distributions not exported by Turing? How do you figure out which dependencies you need when using a wrapper package?

In general you always need to import packages if you need access to their functionality, even if another package is built on them. This is a little different from, say, PyMC, because PyMC has its own distribution objects, so you don’t need to import another package like SciPy to sample from them.


I’m not sure that calling rand inside a Turing model works as expected: at least when using NUTS, the model should specify a deterministic (and differentiable) log-posterior function.
Instead, I would try something like this:

@model function fitlv_time_uncertainty(times, prob, data)
    # Prior distributions
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5), 0.5, 2.5)
    β ~ truncated(Normal(1.2, 0.5), 0, 2)
    γ ~ truncated(Normal(3.0, 0.5), 1, 4)
    δ ~ truncated(Normal(1.0, 0.5), 0, 2)
    τ ~ Uniform(0, 0.3)

    # Simulate Lotka-Volterra model
    p = [α, β, γ, δ]
    predicted = solve(remake(prob, p=p, tspan=(-0.5, maximum(times) + 0.5)), Tsit5())

    # Time uncertainty
    ΔT ~ MvNormal(zeros(length(times)), τ^2 * I)  # standard deviation τ on each timestep
    T = times .+ ΔT
           
    # Observations
    data .~ Normal.(stack(predicted.(T)), σ^2)
end

The idea behind this is that

  1. the time uncertainty should be sampled via Turing just like all the other parameters.
  2. passing auto-diff types – which Turing will do internally for the parameters it samples – to tspan or saveat might not work as expected. Instead, the ODE is solved on a range large enough to cover the whole time span including some uncertainty.
  3. calling predicted.(T) will give an inter-/extrapolation of the solution at the desired times T (including uncertainty); see the sketch below. Further, auto-diff most likely works through the interpolation used by DifferentialEquations.
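For example, with the prob from the previous reply (tspan (0.0, 10.0)), the dense solution object can simply be called at arbitrary times:

sol = solve(prob, Tsit5())        # dense solution, no saveat
sol(0.05)                         # state vector interpolated at t = 0.05
stack(sol.([0.0, 0.05, 0.1]))     # 2×3 matrix, one column per requested time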

Thanks for chipping in! I was already suspecting that Turing wouldn’t know about this distribution.

Actually, the problem I had in mind was a bit simpler than an error term that gets added to each timestep separately. That would mean the uncertainty would have to be pretty small for each timestep, because otherwise there is a large chance of getting overlapping times. For the application I have in mind, there is a sequence of steps which we know are likely sorted correctly (there is little uncertainty on the relative time), but the whole record may have a time that is off by a certain amount. So I wanted to model this with a single absolute uncertainty that allows “sliding” the age of all the timesteps by the same amount. That’s why I was trying to do it directly: τ ~ Uniform(-0.3, 0.3) and then T = times .+ τ, but this gave me a dimension mismatch.

So maybe instead I should:

ΔT = repeat([τ], length(times))
T = times .+ ΔT

EDIT: I played around a bit and thought the below could potentially work, but I’m a little lost with the introduction of the element-wise .~, Normal., and predicted. calls. I also hadn’t seen the stack function before (I’m pretty new to Julia), so I’ll have to dive into this more deeply.

T = times .+ repeat([τ], length(times))

times .+ τ works for me … are you sure you did not forget the .?
The . does broadcasting in Julia, i.e., it applies an operator or function element-wise and automatically expands scalars or singleton dimensions across an array. repeat, fill, or similar ways of reshaping scalars should usually be unnecessary in Julia.
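For example, the scalar is expanded across the whole vector:

julia> [1.0, 2.0, 3.0] .+ 0.5
3-element Vector{Float64}:
 1.5
 2.5
 3.5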
In Turing, data .~ Normal.(means, σ) is the same use of broadcasting, i.e., it does the same as

for i in eachindex(data)
    data[i] ~ Normal(means[i], σ)
end

stack is a utility function converting a vector of vectors into a matrix, e.g.,

julia> vvec = [rand(2) for i in 1:3]
3-element Vector{Vector{Float64}}:
 [0.10015916662622537, 0.3752249406490672]
 [0.5498648308057377, 0.6360195227989996]
 [0.5072838064466204, 0.9646335001052183]

julia> stack(vvec)
2×3 Matrix{Float64}:
 0.100159  0.549865  0.507284
 0.375225  0.63602   0.964634

I hadn’t tried simply putting times .+ τ into your version that doesn’t use saveat but uses the interpolation function. Whoops.

This works! And thank you so much for all the explanations, it really helps!
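For reference, the version that works for me now looks roughly like this (a sketch combining the single-offset idea with your interpolation trick; the model name and the ±0.5 tspan padding are my own choices):

@model function fitlv_offset(times, prob, data)
    # Prior distributions
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5), 0.5, 2.5)
    β ~ truncated(Normal(1.2, 0.5), 0, 2)
    γ ~ truncated(Normal(3.0, 0.5), 1, 4)
    δ ~ truncated(Normal(1.0, 0.5), 0, 2)
    τ ~ Uniform(-0.3, 0.3)   # single offset "sliding" the whole record

    # Solve densely on a span wide enough to cover any shift
    p = [α, β, γ, δ]
    predicted = solve(remake(prob, p=p, tspan=(-0.5, maximum(times) + 0.5)), Tsit5())

    # Broadcasting expands the scalar offset across all timesteps
    T = times .+ τ
    data .~ Normal.(stack(predicted.(T)), σ^2)

    return (T,)   # so generated_quantities can recover the shifted times
end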

Now I just need to figure out what else I can do to speed up my real use-case, because it’ll likely take over an hour to draw 100 samples. I’m going to assume that using the interpolations is a little slower than using saveat directly, because it needs to build the interpolation functions and then apply them to the new timesteps, but there’s probably not much I can do there. After reading Performance Tips – Turing.jl I’ll make sure that I have type stability. I’ve also learned that when building my forward model with the ODE system, I had to make the function non-allocating by passing some placeholders as function arguments. Is this also something that would help speed up a Turing model? Where can I read more about optimization?

A performance tip: use a big MvNormal instead of the broadcasting below:

    data .~ Normal.(stack(predicted.(T)), σ^2)

That’s going to speed things up significantly, because Turing then evaluates one multivariate log-density instead of many univariate ones. BTW, fusing broadcasts is a general performance trick in Julia: use @. whenever possible.
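For the matrix data here, that might look something like this (a hypothetical sketch: data_vec is the data passed to the model pre-flattened with vec(odedata), since an expression like vec(data) cannot appear on the left-hand side of ~):

pred = vec(stack(predicted.(T)))     # flatten the 2×N prediction matrix to length 2N
data_vec ~ MvNormal(pred, σ^2 * I)   # one multivariate observation (I from LinearAlgebra)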