I am trying to fill in some gaps left from trying to implement model comparison for a model fitted in Turing. Following McElreath's Statistical Rethinking (2nd ed.), and with some help from Claude.ai, I have come up with a function that calculates the lppd and the WAIC penalty term for each observation, given posterior samples in chains:
using DataFrames, Distributions, LogExpFunctions, Statistics

function my_waic(chains, X, Y)
    # Keep only the draws of the parameters we need
    dfc = select(DataFrame(chains), [:iteration, :chain, :σ, :β₀, :β₁])
    n = length(Y)
    lppds = Vector{Float64}(undef, n)
    pwaics = Vector{Float64}(undef, n)
    for (i, (x, y)) in enumerate(zip(X, Y))
        # Linear predictor evaluated at every posterior draw
        mus = dfc[:, :β₀] .+ dfc[:, :β₁] .* x
        # Log-likelihood of observation (x, y) under each draw
        log_pds = map((m, s) -> logpdf(Normal(m, s), y), mus, dfc[:, :σ])
        # lppd: log of the mean likelihood across draws (computed stably)
        lppds[i] = logsumexp(log_pds) - log(length(log_pds))
        # Penalty term (effective number of parameters)
        pwaics[i] = var(log_pds)
    end
    return lppds, pwaics
end
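For reference, with S posterior draws θₛ, the two quantities computed inside the loop are:

$$
\mathrm{lppd}_i = \log\left(\frac{1}{S}\sum_{s=1}^{S} p(y_i \mid \theta_s)\right) = \operatorname{logsumexp}_s\, \log p(y_i \mid \theta_s) - \log S,
\qquad
p_{\mathrm{WAIC},i} = \operatorname{Var}_s\left(\log p(y_i \mid \theta_s)\right)
$$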
I can then use it as follows to get total WAIC and its standard error:
lppds, pwaics = my_waic(chains, train_X, train_Y)
elpd = sum(lppds) - sum(pwaics)   # expected log pointwise predictive density
WAIC = -2 * elpd                  # on the deviance scale
WAIC_SE = sqrt(length(lppds) * var(-2 .* (lppds .- pwaics)))
Which gives me

WAIC    = 368.9568650726459
WAIC_SE = 19.188152888480644
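The WAIC and WAIC_SE lines above are meant to implement McElreath's definitions, with the standard error taken over the n pointwise WAIC values:

$$
\mathrm{WAIC} = -2\sum_{i=1}^{n}\left(\mathrm{lppd}_i - p_{\mathrm{WAIC},i}\right),
\qquad
\mathrm{SE}_{\mathrm{WAIC}} = \sqrt{n \cdot \operatorname{Var}_i\!\left(-2\left(\mathrm{lppd}_i - p_{\mathrm{WAIC},i}\right)\right)}
$$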
I did all this to try to understand the logic and the algorithm behind WAIC-based model comparison. I was aiming to recreate what ArviZ.waic does; however, my results are only somewhat consistent with ArviZ's:
ll = let
    plls = pointwise_loglikelihoods(model, chains)
    # Ensure the ordering of the loglikelihoods matches the ordering of `posterior_predictive`
    log_likelihood_y = getindex.(Ref(plls), string.(keys(posterior_predictive)))
    (; Y = cat(log_likelihood_y...; dims=3))
end

idata = ArviZ.from_mcmcchains(chains;
    log_likelihood = ll,
    library = "Turing",
)
waic(idata)
which gives me
WAICResult with estimates
     elpd  elpd_mcse    p  p_mcse
 -1.8e+02        9.6  3.0    0.43
According to Claude.ai, I am on the right track, since WAIC = -2 * elpd.
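For what it's worth, converting my manual numbers to the elpd scale that ArviZ reports (a quick check, reusing lppds and pwaics from above; the approximate values follow from the outputs I reported earlier):

manual_elpd = sum(lppds) - sum(pwaics)                  # ≈ -184.5, vs ArviZ's -1.8e+02
manual_se = sqrt(length(lppds) * var(lppds .- pwaics))  # ≈ 9.6 (= WAIC_SE / 2), vs ArviZ's elpd_mcse of 9.6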
I wonder if I am indeed on the right track with my “manual” computation of WAIC and WAIC_SE.
I would also like to know why ArviZ's version reports values that are rounded off so coarsely.