Likelihood from Turing model

Hello all,

From the following Turing ODE tutorial: https://turing.ml/v0.22/tutorials/10-bayesian-differential-equations/

I wonder if there is an easy way to compute the likelihood for each ODE model variable, either directly from Turing or manually from the MCMC chains?

Many thanks.

using Turing, DifferentialEquations, LinearAlgebra

@model function fitlv(data, prob)
    # Prior distributions.
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)

    # Simulate Lotka-Volterra model. 
    p = [α, β, γ, δ]
    predicted = solve(prob, Tsit5(); p=p, saveat=0.1)

    # Observations.
    for i in 1:length(predicted)
        data[:, i] ~ MvNormal(predicted[i], σ^2 * I)
    end

    return nothing
end

model = fitlv(odedata, prob)

# Sample 3 independent chains with forward-mode automatic differentiation (the default).
chain = sample(model, NUTS(0.65), MCMCSerial(), 1000, 3; progress=false)

Can you clarify what you mean by the “likelihood for each ODE model variable”?

Sure! In this tutorial there are two model variables, I think referred to as prey and predators, associated with two sets of data. If we integrate the model using the MCMC chains, we obtain two sets of fits, one for the prey and one for the predators. I would like to obtain the likelihood of each of those fits, for both prey and predators.

I’m still not quite clear on what you mean. In the example, the data consist of two time series: one measuring prey abundance and one measuring predator abundance. The parameters of the system that you are fitting are alpha (population growth rate of the prey), beta (per-predator predation rate), gamma (population growth of predators per prey eaten) and delta (death rate of predators). In the first example the model (fitlv) is fitted using both the predator and prey time series, and in the second example (fitlv2) the prey time series is omitted, but you’re still inferring the values of those same four parameters in each case.
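For concreteness, a single-series model might look roughly like this. This is only a sketch, not the exact fitlv2 code from the tutorial; it assumes the predator abundances are passed in as a plain vector and that the predator is the second state of the ODE:

@model function fitlv2(data::AbstractVector, prob)
    # Prior distributions (same as in fitlv).
    σ ~ InverseGamma(2, 3)
    α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
    β ~ truncated(Normal(1.2, 0.5); lower=0, upper=2)
    γ ~ truncated(Normal(3.0, 0.5); lower=1, upper=4)
    δ ~ truncated(Normal(1.0, 0.5); lower=0, upper=2)

    # Simulate the Lotka-Volterra model, saving only the predator component.
    p = [α, β, γ, δ]
    predicted = solve(prob, Tsit5(); p=p, saveat=0.1, save_idxs=2)

    # Observations: one univariate normal per time point.
    for i in 1:length(predicted)
        data[i] ~ Normal(predicted[i], σ)
    end

    return nothing
end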

When you say “likelihood,” do you mean in the statistical sense (i.e., P(data | parameters))? Or are you asking about the posterior probabilities for the parameters, or the predator/prey population densities?

Sorry for the confusion; maybe I am just confused about how Turing computes the likelihood. My understanding is that, in the fitlv function from the tutorial, the likelihood P(data | parameters) is computed here:

data[:, i] ~ MvNormal(predicted[i], σ^2 * I)

But what I want is to compare different models by comparing the likelihood of each, evaluated at the posterior estimates of the parameters. So my idea is to integrate the model after the MCMC, using the posterior values of the model parameters, and then compute the likelihood in order to calculate indicators like AIC. I am not sure how best to compute the likelihood at this stage. Is that clearer?
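Concretely, I picture something along these lines: re-solve the ODE at one set of posterior parameter values and sum the per-time-point MvNormal log-densities. This is only a rough sketch (prob and odedata are the objects from the tutorial, and evaluating at the posterior means is just for illustration):

using DifferentialEquations, Distributions, LinearAlgebra

# Rough sketch: log-likelihood P(data | parameters) for one set of parameter values,
# obtained by re-solving the ODE and summing the MvNormal log-densities.
function manual_loglik(prob, odedata, α, β, γ, δ, σ)
    predicted = solve(prob, Tsit5(); p=[α, β, γ, δ], saveat=0.1)
    return sum(logpdf(MvNormal(predicted[i], σ^2 * I), odedata[:, i]) for i in 1:length(predicted))
end

# e.g. at the posterior means from the chain:
# manual_loglik(prob, odedata, mean(chain[:α]), mean(chain[:β]),
#               mean(chain[:γ]), mean(chain[:δ]), mean(chain[:σ]))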

Ah, yes, much clearer! The chain contains a field :log_density, which records the log-posterior value for each sample. If you want the log-likelihood instead, you can calculate it after the fact using the loglikelihood function from Turing:

using Turing, DataFrames

@model function demo(x)
    m ~ Normal()
    s ~ Exponential()
    x .~ Normal(m, s)
end

mod = demo(randn(100))
chain = sample(mod, NUTS(), 1000)
loglikelihood(mod, (m=1.0, s=1.0)) # for a single set of parameters
ll = [loglikelihood(mod, row) for row in eachrow(DataFrame(chain))] # for each sample in chain
all(ll .> chain[:log_density]) # true

Note that loglikelihood expects the second argument to have named fields for each of the parameters.
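And for the AIC use case: once you have the per-sample log-likelihoods, a rough sketch of an AIC-style calculation is below. Here the largest log-likelihood found among the posterior samples stands in for the maximized likelihood, ll is built from your Lotka-Volterra model and chain in the same way as above, and k counts the free parameters (σ, α, β, γ, δ, so 5 in fitlv):

# Rough sketch: AIC from the per-sample log-likelihoods `ll` computed above,
# using the best posterior-sample log-likelihood in place of the MLE.
k = 5                      # free parameters in fitlv: σ, α, β, γ, δ
aic = 2k - 2 * maximum(ll)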


Many thanks! Exactly what I needed!