A simple model which I can't get Turing to fit to simulated data

Having just started with Turing, I am unable to understand why this model does not produce good results.

using Turing
using MCMCChains

λ = Normal(0, 1)
data = rand.(Poisson.(exp.(rand(λ, 2^16))))

@model function m(y, σ²)
    λ ~ Normal(0, sqrt(σ²))  # Normal takes a standard deviation, not a variance
    for i ∈ 1:length(y)
        y[i] ~ Poisson(exp(λ))
    end
end

model = m(data, 1)
sampler = NUTS(2^10, 0.95)
chain = sample(model, sampler, 2^12)

I expected the posterior for λ to be Normal(0, 1), since that was used for the simulated data.

The actual result is fairly consistently something like Normal(0.5, 0.03), and I don’t understand why that would be the case.

Can someone point me in the right direction? Is the model I’ve written the wrong one for the simulated data?

Yes, that’s right: the Turing model does not match the simulation. The model you used for simulation is

\begin{aligned} \lambda_i &\sim \mathrm{Normal}(0, 1)\\ y_i &\sim \mathrm{Poisson}(e^{\lambda_i}) \end{aligned},

but the model you have written with @model is

\begin{aligned} \lambda &\sim \mathrm{Normal}(0, 1)\\ y_i &\sim \mathrm{Poisson}(e^{\lambda}) \end{aligned}.

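The difference is that the Turing model uses a single global \lambda, while in the simulation model, each datum has its own \lambda_i. This also explains the value you observed: with a single \lambda, the posterior concentrates where e^{\lambda} matches the mean of the counts, and since E[e^{\lambda_i}] = e^{1/2} for \lambda_i \sim \mathrm{Normal}(0, 1), that is close to \lambda = 0.5.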
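
For reference, here is a minimal sketch of a Turing model that gives each datum its own \lambda_i. The name `m_per_datum`, the use of `filldist`, and the small dataset size are my own choices for illustration, not something from your post, and I pass the standard deviation `σ` directly.

```julia
using Turing

# Hypothetical model name; one latent λ[i] per observation, matching the simulation.
@model function m_per_datum(y, σ)
    n = length(y)
    # n independent Normal(0, σ) latent variables, one per datum
    λ ~ filldist(Normal(0, σ), n)
    for i in 1:n
        y[i] ~ Poisson(exp(λ[i]))
    end
end

# A small simulated dataset (assumed size 2^6) keeps the number of latent variables manageable.
small_data = rand.(Poisson.(exp.(rand(Normal(0, 1), 2^6))))
chain = sample(m_per_datum(small_data, 1.0), NUTS(), 2^9)
```

With the full 2^16 observations this model has 2^16 latent variables, so sampling will be much slower than with the single-\lambda version.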

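But even if you fix that, your posterior would not in general look like the prior. A posterior conditioned on a single simulated dataset concentrates around the parameter values that generated that dataset. Posterior draws only reproduce the prior when you average over many datasets simulated from the prior, i.e. when using the following procedure: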

  1. Draw a single parameter \theta from the prior.
  2. Draw a single dataset y | \theta.
  3. Draw \tilde{\theta} from the posterior conditioned on y.
  4. Repeat steps 1-3 many times. Discard y and \theta. The empirical distribution of the collected \tilde{\theta} draws will converge to the prior distribution.

This is the basis of the Simulation-Based Calibration method for checking if a specific inference method is incompatible with a given model.
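
The four steps above can be sketched without any MCMC by picking a model whose posterior is known exactly. The Gamma-Poisson model below is an assumption chosen only because it is conjugate; it is not the model from the question, and the names `one_round` and `θ_tilde` are hypothetical.

```julia
using Distributions, Random, Statistics

Random.seed!(1)

# Assumed toy model: θ ~ Gamma(α, rate β), y_i | θ ~ Poisson(θ).
# Conjugacy gives the exact posterior Gamma(α + sum(y), rate β + n).
α, β, n = 2.0, 1.0, 10
prior = Gamma(α, 1 / β)  # Distributions.jl parameterizes Gamma by (shape, scale)

function one_round(prior, α, β, n)
    θ = rand(prior)                              # step 1: one parameter from the prior
    y = rand(Poisson(θ), n)                      # step 2: one dataset given θ
    posterior = Gamma(α + sum(y), 1 / (β + n))   # exact posterior for this dataset
    return rand(posterior)                       # step 3: one posterior draw
end

# Step 4: repeat many times, keeping only the posterior draws.
θ_tilde = [one_round(prior, α, β, n) for _ in 1:20_000]

# The pooled draws should match the prior's moments up to Monte Carlo error.
println((mean(θ_tilde), mean(prior)))
println((var(θ_tilde), var(prior)))
```

Any single posterior in this loop is much narrower than the prior; only the pooled draws across rounds recover it.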
