# [Turing.jl] How to generate samples ("observations") from a model

Hello,
Let's say I define a Turing model and would like to use it to generate not only samples from the prior but also some associated `y` observations. Here is an example:

```julia
using Turing, LinearAlgebra

@model function mymodel(y::AbstractVector{F}) where {F<:Real}
    T = length(y)
    θ ~ Uniform(0, 1)
    y ~ MvNormal(θ * Diagonal(ones(T)))
    return y # BTW I am not sure if this last line is useful
end
```

If I do `rand(mymodel(y))` for some

```julia
y = randn(10) # for rand(mymodel(y)) the y values do not matter, I believe
```

I get a sample for `θ` from its prior (uniform here).
However, here I would also like to get an associated realization for `y`, i.e. `y_realization = rand(MvNormal(θ * Diagonal(ones(T))))`.

Of course, I could hand-code the sampler function for `y`, but that seems a bit redundant since the distribution is already in my model.
I tried looking into various functions of `DynamicPPL.jl`, but I cannot figure out how to extract and use the
`y ~ MvNormal(θ * Diagonal(ones(T)))` line. If such a function already exists, I missed it.
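For concreteness, the hand-coded sampler I mean is tiny; here is a Base-Julia sketch (no Turing involved; `hand_sample` is just an illustrative name, and since the covariance above is θ·I, a draw of `y` is `sqrt(θ) .* randn(T)`):

```julia
using Random

# Hand-coded ancestral sampler for this particular model, Base Julia only:
# θ ~ Uniform(0,1) is rand(); y | θ ~ N(0, θ*I) has marginal std sqrt(θ).
function hand_sample(T::Int; rng = Random.default_rng())
    θ = rand(rng)                 # θ ~ Uniform(0, 1)
    y = sqrt(θ) .* randn(rng, T)  # y ~ MvNormal(θ * I)
    return θ, y
end

θ, y = hand_sample(10)
```

This is exactly the redundancy in question: the two sampling statements mirror the two `~` lines already written in the model.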

```julia
@model function mymodel2(y::AbstractVector{F}) where {F<:Real}
    T = length(y)
    θ ~ Uniform(0, 1)
    y ~ MvNormal(θ * Diagonal(ones(T)))
    return rand(MvNormal(θ * Diagonal(ones(T))))
end
```

with

```julia
t = rand(mymodel(rand(10)))          # NamedTuple of prior draws, e.g. (θ = 0.42,)
generated_quantities(mymodel2(y), t) # runs the model body for these parameters
```

does the job.

However, it is not satisfying:

• When doing inference, e.g. `chain2 = sample(mymodel2(y), MH(), 10000)`, the `rand` inside the model is useless and is called every time the model is evaluated.
• This is probably inefficient if one wants to call `generated_quantities` `N` times, compared to `rand(MvNormal(θ * Diagonal(ones(T))), N)`.
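To illustrate the second point in plain Julia (no Distributions; `batched_draws` is a hypothetical helper): for a fixed θ, `N` draws from N(0, θ·I) can be generated in one batched call instead of `N` separate ones:

```julia
using Random

# For a fixed θ, draw N samples of a T-dimensional y ~ N(0, θ*I) at once;
# each column of the T×N matrix is one draw (std = sqrt(θ)).
batched_draws(θ::Real, T::Int, N::Int; rng = Random.default_rng()) =
    sqrt(θ) .* randn(rng, T, N)

Y = batched_draws(0.5, 10, 1000)  # 10×1000 matrix, one draw per column
```

A single allocation and a single `randn` call replace `N` model evaluations.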

If I’m reading this correctly, what you want is the prior predictive distribution of your model.
Turing has the `predict` function, which you can use to get the prior predictive distribution for `y`.

In practice I’ve found this to be a bit hard to use; in fact, I couldn’t get your model to work with the `MvNormal` version but had to rewrite the distribution as a loop.

How it is used:

First define the model:

```julia
@model function mymodel(y)
    T = length(y)
    θ ~ Uniform(0, 1)

    for n in eachindex(y)
        y[n] ~ Normal(0, θ)
    end
end
```

Again, I’ve rewritten your `y ~ ...` as a for loop.
Then instantiate the model and sample from the prior for `θ`.

```julia
mod = mymodel(rand(100))
chn = sample(mod, Prior(), 1000)
```

Next, use the `predict` function to get `y` values for the samples of `θ` in `chn`.

```julia
mod_prior_pred = mymodel(Vector{Union{Missing, Float64}}(undef, 100))
prior_pred = predict(mod_prior_pred, chn)
```

The first step creates a new model with a missing vector as `y`.
These will be predicted by `predict`.

Note that this approach also works for obtaining the posterior predictive distribution, or for predicting new response values given some predictors (as in regression).
In that case we do not sample with `Prior()` but use an MCMC algorithm like `NUTS()` to get the posterior distribution; the rest of the example is identical.

Thank you, I did not know about `predict`. This led me to the Turing guide, where they also sample from the prior using `missing`.
Using `predict` seems faster than the method in the guide, though.

However, as you said, it is not very convenient that `MvNormal` cannot be used, because (for reference)

```
ERROR: MethodError: no method matching loglikelihood(::ZeroMeanDiagNormal{Tuple{Base.OneTo{Int64}}}, ::Vector{Union{Missing, Float64}})
```

the `loglikelihood` of `MvNormal` is not defined for vectors with `missing` entries.

I found this hack, which seems to work with the `MvNormal` version (but not with the loop version!):

```julia
# my model
@model function mymodelMv(y::AbstractVector)
    T = length(y)
    θ ~ Uniform(0, 1)
    y ~ MvNormal(Diagonal(θ * ones(T)))
end

# same model, but with no observation y
@model function mymodelMv(T::Int)
    θ ~ Uniform(0, 1)
    y ~ MvNormal(Diagonal(θ * ones(T)))
end

function mysampler(model, N)
    y = model.args[:y]
    n = length(y)

    # model with no observations
    mod_prior_pred = model.f(n)

    # sample from the prior
    chn = sample(model, Prior(), N)

    # predict the "missing" y with the chain of θ
    prior_pred = predict(mod_prior_pred, chn)
    θ = chn.value[:, 1]
    y = prior_pred.value
    return y, θ
end
```

Are there better suggestions for a `mysampler(model, N)` function that works (efficiently) for generic Turing models?

This seems to be a known issue; see the GitHub issue “Sampling from prior using return value doesn't work with MvNormal” (TuringLang/Turing.jl #1559).
I just tested it again, and the example works if you pass a single `missing` value for `y`. Thinking about it, this also makes sense, since `y = missing` is treated as a single multivariate missing observation.

You can then write,

```julia
@model function mymodel(y, l = ismissing(y) ? 1 : length(y))
    θ ~ Uniform(0, 1)
    y ~ MvNormal(zeros(l), θ)
end

mod = mymodel(rand(10))
chn = sample(mod, Prior(), 1000)

mod_prior_pred = mymodel(missing, 10)
prior_pred = predict(mod_prior_pred, chn)
```

which is similar to your example, but

1. As described above, `y` is not a vector of missing observations but a single vector-variate missing observation.
2. The second parameter `l` describes the dimension of the multivariate observation. This matters because, if `y` is missing, nothing is known about its dimensionality and `length(missing)` is not defined. So `l` defaults to 1 if `y` is missing, but to `length(y)` if `y` is not.
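The default-argument trick in point 2 stands on its own in plain Julia; here is a minimal sketch outside of any `@model` (`obsdim` is just an illustrative name):

```julia
# Same default-argument pattern as in the model signature:
# if y is missing its length is unknown, so fall back to a supplied default.
obsdim(y, l = ismissing(y) ? 1 : length(y)) = l

obsdim([1.0, 2.0, 3.0])  # taken from length(y)
obsdim(missing)          # the fallback of 1
obsdim(missing, 10)      # explicitly supplied dimension
```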

For example, to draw a 10-dimensional `y` observation:

```julia
mod_prior_pred = mymodel(missing, 10)
prior_pred = predict(mod_prior_pred, chn)
```

Thanks, this works.

However, I wonder why this is so much slower than just doing

```julia
function handmodel(N, l)
    θ = zeros(N)
    y = zeros(N, l)
    for i in 1:N
        θ[i] = rand(Uniform(0, 1))
        y[i, :] = rand(MvNormal(zeros(l), θ[i]))
    end
    return θ, y
end
```
```julia
using BenchmarkTools

N = 1000
l = 10
@btime begin
    chn = sample(mod, Prior(), N)
    mod_prior_pred = mymodel(missing, l)
    prior_pred = predict(mod_prior_pred, chn)
end
# 17.800 ms (392324 allocations: 21.96 MiB)
@btime handmodel(N, l)
# 122.900 μs (2004 allocations: 367.39 KiB)
```

Is it because of how Turing stores a lot of (useless in this case) information along the chain? It seems like a lot of overhead just for sampling from the prior. Or is there an option I am not seeing?

Ideally, I would like an “efficient” sampler like `handmodel`, using only `rand`, whose input is a `Turing.jl` model (because the syntax is nice, and if I want to use other sampling methods like `NUTS()` or `MH()` I can easily switch).

Hence, I believe the task is to extract the distributions and the conditional structure from the model (which I did not manage to do).

I am no expert on Turing internals, but I think a good portion of the runtime, at least for this use case, is spent constructing the output object. So if efficiency is an important criterion for you, it is probably best to write your own prior predictive sampler, just as you did with `handmodel`.
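For completeness, the hand-rolled sampler can even be fully vectorized in Base Julia, avoiding the per-sample loop entirely (`prior_predictive` is a hypothetical name; this uses std = θ, matching `handmodel` above rather than the covariance-θ parameterization of the very first model):

```julia
using Random

# Vectorized prior predictive for this model, Base Julia only:
# θ[i] ~ Uniform(0,1); y[i, :] ~ N(0, θ[i]^2 * I) (std = θ, as in handmodel).
function prior_predictive(N::Int, l::Int; rng = Random.default_rng())
    θ = rand(rng, N)           # N prior draws of θ
    y = θ .* randn(rng, N, l)  # row i is scaled by θ[i]
    return θ, y
end

θ, y = prior_predictive(1000, 10)
```

Two allocations and two vectorized draws replace the `N`-iteration loop, which is about as cheap as prior predictive sampling for this model can get.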