# Differences in using MvNormal and Normal in the calculation of the likelihood

Hello all,

I have a question related the ODE Turing tutorial:

``````using Turing, DifferentialEquations, StatsPlots, LinearAlgebra, Random

Random.seed!(14);

function lotka_volterra(du, u, p, t)
# Model parameters.
α, β, γ, δ = p
# Current state.
x, y = u

# Evaluate differential equations.
du[1] = (α - β * y) * x # prey
du[2] = (δ * x - γ) * y # predator

return nothing
end

# Define initial-value problem.
u0 = [1.0, 1.0]
p = [1.5, 1.0, 3.0, 1.0]
tspan = (0.0, 10.0)
prob = ODEProblem(lotka_volterra, u0, tspan, p)

sol = solve(prob, Tsit5(); saveat=0.1)
odedata = Array(sol) + 0.8 * randn(size(Array(sol)))

@model function fitlv(data, prob)
# Prior distributions.
σ ~ InverseGamma(2, 3)
α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
β ~ truncated(Normal(1.2, 0.5); lower=0,   upper=2)
γ ~ truncated(Normal(3.0, 0.5); lower=1,   upper=4)
δ ~ truncated(Normal(1.0, 0.5); lower=0,   upper=2)

# Simulate Lotka-Volterra model.
p = [α, β, γ, δ]
predicted = solve(prob, Tsit5(); p=p, saveat=0.1)

# Observations.
for i in 1:length(predicted)
data[:, i] ~ MvNormal(predicted[i], σ^2 * I)
end
return nothing
end

model = fitlv(odedata, prob)

# Sample 3 independent chains with forward-mode automatic differentiation (the default).
@time chain = sample(model, NUTS(), MCMCSerial(), 500, 3; progress=true)
``````

Now imagine that we do not have the same amount of observations for preys and predators in `odedata`. We will have missing values and I have read it is quite problematic when calculating the likelihood.

Is it possible to just concatenate the predators and prey and use `Normal` instead of `MvNormal` as follows?

Thanks

``````@model function fitlv_test(data, prob)
# Prior distributions.
σ ~ InverseGamma(2, 3)
α ~ truncated(Normal(1.5, 0.5); lower=0.5, upper=2.5)
β ~ truncated(Normal(1.2, 0.5); lower=0,   upper=2)
γ ~ truncated(Normal(3.0, 0.5); lower=1,   upper=4)
δ ~ truncated(Normal(1.0, 0.5); lower=0,   upper=2)

# Simulate Lotka-Volterra model.
p = [α, β, γ, δ]
predicted = solve(prob, Tsit5(); p=p, saveat=0.1)
pred = vcat(predicted[1,:],predicted[2,:]);

# Observations.
for i in 1:length(pred)
data[i] ~ Normal(pred[i], σ^2)
end
return nothing
end

data_test = vcat(odedata[1,:],odedata[2,:])
model_test = fitlv_test(data_test, prob)

@time chain_test = sample(model_test, NUTS(), MCMCSerial(), 500, 3; progress=true)
``````

Yes, this is equivalent in terms of probability:

``````julia> using Distributions

julia> logpdf(MvNormal([1 0; 0 1]), [1.0, -2.0])
-4.337877066409345

julia> logpdf(Normal(), 1.0) + logpdf(Normal(), -2)
-4.337877066409345
``````

Depending on how many missing values you have for each species and how they’re arranged, it may make more sense to keep the data and ODE solutions as arrays, and do something like this inside your model:

``````for i in findall(!ismissing, data)
data[i] ~ Normal(pred[i], σ)
end
``````

There may be a more efficient vectorized way if you’re using reverse-mode AD, but for five parameters it shouldn’t be an issue and this is a clear way to express what the model is doing.

Amazing, thanks a lot @ElOceanografo.

Another related question: imagine you want different `σ` for your two variables, how Turing is estimating the “global” likelihood of your problem if you have separate distributions like as follows?
Thanks a lot.

``````for i in findall(!ismissing, data)
data_prey[i] ~ Normal(pred_prey[i], σ_prey^2)
data_predators[i] ~ Normal(pred_predators[i], σ_predators^2)
end
``````

The global likelihood is just the product of all the individual conditional likelihoods in the model (though it’s actually calculated as a sum of log-likelihoods). That’s what Turing is doing inside the `@model` macro: each tilde statement `value ~ SomeDistribution` basically gets translated to `logpdf(SomeDistribution, value)`, and they’re all added up to get the total log-likelihood (or, if we’re including the priors, the unnormalized log-posterior).

If you have different observation errors for the predator and prey, it just means the model will weigh the goodness-of-fit to the predator and prey time series differently. If those observation errors are themselves parameters in the model, it will try to fit them as well–the log-likelihood will be highest when they are correct, which is easy to check for yourself:

``````julia> using Distributions, Random

julia> Random.seed!(1)

julia> x = rand(Normal(0.0, 1.0), 100);

julia> loglikelihood(Normal(0.0, 0.5), x) # too low
-216.49103411053034

julia> loglikelihood(Normal(0.0, 1.0), x) # just right
-140.37182803198166

julia> loglikelihood(Normal(0.0, 1.5), x) # too high
-153.98613066973462
``````

Finally, I just noticed you’ve been writing `Normal(pred[i], σ^2)`, which should be `Normal(pred[i], σ)`, since Julia’s normal distribution is parameterized in terms of the standard deviation, not the variance. The model will still run that way, but you may get confused interpreting the results.

1 Like

@ElOceanografo, thanks a lot for these useful remarks.

regarding the following comment:

Yes it was a mistake, I have been confused because when using the `MvNormal` distribution, it is setup using variance (for the covariance matrix) while when using a `Normal` distribution it set up using the `σ`.