ODE and Turing problems with sparse and transformed data

Dear Julia community,

I am new to Julia and am trying to implement a complex Bayesian inference procedure on an ODE system using Turing.

I base my code on the great Turing tutorial: Bayesian Estimation of Differential Equations.

While the simple case works, I would like to extend this tutorial to two more realistic problems for my research.

  1. Let’s say I have a Lotka-Volterra model with variables X and Y. In this case, X and Y can differ by several orders of magnitude, so I want to log-transform both my data and the predicted ODE solution when estimating the likelihood.
    From the tutorial example, I tried the following code:
Turing.setadbackend(:forwarddiff)

@model function fitlv(data, prob1)
    σ ~ InverseGamma(2, 3) # prior on the observation noise
    α ~ truncated(Normal(1.5,0.5),0.5,2.5)
    β ~ truncated(Normal(1.2,0.5),0,2)
    γ ~ truncated(Normal(3.0,0.5),1,4)
    δ ~ truncated(Normal(1.0,0.5),0,2)
    p = [α,β,γ,δ]
    prob = remake(prob1, p=p)
    predicted = solve(prob,Tsit5(),saveat=0.1)
    # shift so the smallest value maps to 1 before log-transforming
    temp = predicted .+ 1 .- minimum(predicted)
    for i = 1:length(predicted)
        log_predicted = log10.(temp[i])
        data[:,i] ~ MvNormal(log_predicted, σ)
    end
end

model = fitlv(log10.(odedata .+ 1 .- minimum(odedata)), prob1)
chain = mapreduce(c -> sample(model, NUTS(.65),1000), chainscat, 1:3)

But I ran into problems: for some parameter sets the solution can be badly estimated by the ODE solver and go below zero, the log of a negative value then fails, and the sampling stops … do you have any insights on the best way to implement this?

  2. Now let’s say that the X and Y data are not homogeneous and don’t share the same time vector or size. How do you implement this in Turing? Especially when estimating the likelihood, because if X has 10x more data points than Y, the fit will be weighted towards the X channel.

Many thanks for all your help!

That’s just going to happen because there are some parameters where the solution can go to zero. One way to help with this is to tighten the solver tolerances: solve(prob, Vern9(), saveat=0.1, abstol=1e-10, reltol=1e-10).

Just use saveat effectively (i.e. save at the observation times you actually need), or use the post-solution interpolation, as in the sketch below.
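For concreteness, a minimal sketch of those two options, assuming tobs_X and tobs_Y hold the (different) observation times of the two variables and prob is the Lotka-Volterra problem from above (tall is just an illustrative name):

# Option 1: save exactly at the union of the two observation-time grids
# (then pick out the indices corresponding to tobs_X and tobs_Y)
tall = sort(unique(vcat(tobs_X, tobs_Y)))
sol  = solve(prob, Tsit5(), saveat = tall)

# Option 2: keep the default dense output and evaluate the continuous
# interpolation at each observation time afterwards
sol = solve(prob, Tsit5())
predicted_X = [sol(t)[1] for t in tobs_X]   # first state at X's times
predicted_Y = [sol(t)[2] for t in tobs_Y]   # second state at Y's times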

Hi Chris, thanks a lot for your answer. I solved (1), but I am struggling with (2).
I used both options (saveat or interpolation), but my sampling is not working and I think my likelihood statements are problematic. Instead of using MvNormal, I use a Normal for each of my two variables X and Y, as follows (see the end of this post for a fuller model sketch):

# Likelihood for X
for i = 1:length(tobs_X)
  log_predicted_X = log(predicted_X[i])
  data_X[i] ~ Normal(log_predicted_X, σ_X)
end

# Likelihood for Y
for i = 1:length(tobs_Y)
  log_predicted_Y = log(predicted_Y[i])
  data_Y[i] ~ Normal(log_predicted_Y, σ_Y)
end

The sampling is running forever with this kind of warning:

┌ Warning: The current proposal will be rejected due to numerical error(s).
│   isfinite.((θ, r, ℓπ, ℓκ)) = (true, true, false, true)
└ @ AdvancedHMC ~/.julia/packages/AdvancedHMC/yd6UP/src/hamiltonian.jl:47

Remember that here X and Y have neither the same observation times nor the same number of points.
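For reference, here is roughly how the two likelihood blocks above could sit inside the full model using the interpolation option, assuming data_X and data_Y are already log-transformed; the model name and the σ priors are placeholders, and the rest of the setup (using Turing, DifferentialEquations, prob1) is as in the first post:

@model function fitlv_two_series(data_X, data_Y, tobs_X, tobs_Y, prob1)
    σ_X ~ InverseGamma(2, 3)
    σ_Y ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5), 0.5, 2.5)
    β ~ truncated(Normal(1.2, 0.5), 0, 2)
    γ ~ truncated(Normal(3.0, 0.5), 1, 4)
    δ ~ truncated(Normal(1.0, 0.5), 0, 2)
    prob = remake(prob1, p = [α, β, γ, δ])
    sol = solve(prob, Tsit5())          # dense output, interpolated below

    # Likelihood for X at its own observation times
    for i = 1:length(tobs_X)
        data_X[i] ~ Normal(log(sol(tobs_X[i])[1]), σ_X)
    end

    # Likelihood for Y at its own observation times
    for i = 1:length(tobs_Y)
        data_Y[i] ~ Normal(log(sol(tobs_Y[i])[2]), σ_Y)
    end
end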

Which values are infinite? Print them and see.

Actually, it was because I had NaNs in my data … so now it is working.
But I am wondering whether having n data points for X and m data points for Y with n >> m will weight X more heavily than Y?

Then just make σ_X larger than σ_Y: the variance of the likelihood is the inverse of the weight, so a bigger variance down-weights those points.
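To spell out why the variance acts as an inverse weight: each observation contributes a Gaussian log-density

\log p(y \mid \mu, \sigma) = -\frac{(y - \mu)^2}{2\sigma^2} - \log\left(\sigma\sqrt{2\pi}\right),

so the residual term is scaled by 1/σ². With n points for X and m points for Y, the X channel pulls on the fit with roughly n/σ_X² versus m/σ_Y² for Y, which is why adjusting σ_X relative to σ_Y rebalances the two channels.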


Thanks a lot, Chris!

I think you can @addlogprob! -Inf and return immediately, skipping the logarithm step; this will cause the sampler to reject the proposal.
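Here is a minimal sketch of that suggestion applied to the fitlv model from the first post. It drops the + 1 .- minimum(...) shift and instead rejects any proposal whose trajectory contains non-positive or non-finite values; checking the solver's return code would be another option:

@model function fitlv(data, prob1)
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5), 0.5, 2.5)
    β ~ truncated(Normal(1.2, 0.5), 0, 2)
    γ ~ truncated(Normal(3.0, 0.5), 1, 4)
    δ ~ truncated(Normal(1.0, 0.5), 0, 2)
    prob = remake(prob1, p = [α, β, γ, δ])
    predicted = solve(prob, Tsit5(), saveat = 0.1)

    # The log-transform below cannot handle non-positive or non-finite
    # values, so reject this proposal outright instead of erroring.
    if any(u -> any(x -> !isfinite(x) || x <= 0, u), predicted.u)
        Turing.@addlogprob! -Inf
        return nothing
    end

    for i = 1:length(predicted)
        data[:, i] ~ MvNormal(log10.(predicted[i]), σ)
    end
end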