Hi,
I’ve been trying to use the predict
function in Turing
for Posterior Prediction as per here: https://discourse.julialang.org/t/ann-turing-0-13-0-mle-map-and-prediction/40033#prediction-2
Based on that, I tried applying the same syntax but on this example: https://storopoli.io/Bayesian-Julia/pages/11_multilevel_models/#random-intercept_model
using Pkg
Pkg.activate(".")
using Turing
using StatsBase
using LinearAlgebra
using Statistics: mean, std
using Random: seed!
seed!(123)
using DataFrames
using CSV
using HTTP
url = "https://raw.githubusercontent.com/storopoli/Bayesian-Julia/master/datasets/cheese.csv"
cheese = CSV.read(HTTP.get(url).body, DataFrame)
describe(cheese)
for c in unique(cheese[:, :cheese])
cheese[:, "cheese_$c"] = ifelse.(cheese[:, :cheese] .== c, 1, 0)
end
cheese[:, :background_int] = map(cheese[:, :background]) do b
if b == "rural"
1
elseif b == "urban"
2
else
missing
end
end
first(cheese, 5)
@model function varying_intercept(
X, idx, y; n_gr=length(unique(idx)), predictors=size(X, 2)
)
#priors
α ~ Normal(mean(y), 2.5 * std(y)) # population-level intercept
β ~ filldist(Normal(0, 2), predictors) # population-level coefficients
σ ~ Exponential(std(y)) # residual SD
#prior for variance of random intercepts
#usually requires thoughtful specification
τ ~ truncated(Cauchy(0, 2); lower=0) # group-level SDs intercepts
αⱼ ~ filldist(Normal(0, τ), n_gr) # group-level intercepts
#likelihood
ŷ = α .+ X * β .+ αⱼ[idx]
return y ~ MvNormal(ŷ, σ^2 * I)
end;
X = Matrix(select(cheese, Between(:cheese_A, :cheese_D)));
y = cheese[:, :y];
idx = cheese[:, :background_int];
model_intercept = varying_intercept(X, idx, y)
chain_intercept = sample(model_intercept, NUTS(), MCMCThreads(), 1_000, 4)
println(DataFrame(summarystats(chain_intercept)))
pred_model = varying_intercept(X, idx, Vector{Union{Missing, Float64}}(undef, length(y)) )
predictions = StatsBase.predict(pred_model, chain_intercept)
# c = sample(pred_model, NUTS(), 1_000)
Essentially I tried using the predict
function on the same dataset but with Missing
y-values just to get a taste on the syntax, but I’ve got the following error:
ERROR: MethodError: no method matching Normal(::Missing, ::Missing)
I’ve seen other posts raising issues with the MvNormal
function: https://discourse.julialang.org/t/turing-jl-how-to-generate-samples-observations-from-a-model/95258/2, but that was some time ago, and thought that might have been addressed.
I’ve also tried following the Guide: https://turinglang.org/dev/docs/using-turing/guide/#treating-observations-as-random-variables, and used the sample
function instead (un-commenting out the final line c = sample(pred_model, NUTS(), 1_000)
and commenting out the StatsBase.predict
line in the block above)
Any hints as to how I can get the posterior prediction to work here?
Thanks in advance!