I am trying to fill in some gaps left from trying implement model comparison of a model fitted in Turing. Following McElreath’s Statistical Rethinking (2nd ed) and getting help from Claude.ai, I have come up with a function that calculates lppds for each observation, given posterior samples in chains:
function my_waic(chains, X, Y)
    dfc = @chain DataFrame(chains) begin
        select([:iteration, :chain, :σ, :β₀, :β₁])
    end
    n = length(Y)
    lppds = Vector{Float64}(undef, n)
    pwaics = Vector{Float64}(undef, n)
    for (i, (x, y)) in zip(X, Y) |> enumerate
        mus = dfc[:, :β₀] + dfc[:, :β₁] .* x
        log_pds = map((m, s) -> logpdf(Normal(m, s), y), mus, dfc[:, :σ])
        # Calculate lppd
        lppds[i] = logsumexp(log_pds) - log(length(log_pds))
        
        # Calculate penalty term (effective number of parameters)
        pwaics[i] = var(log_pds)
    end
    return lppds, pwaics
end
I can then use it as follows to get total WAIC and its standard error:
lppds, pwaics = my_waic(chains, train_X, train_Y)
elpd = sum(lppds) - sum(pwaics)
WAIC = -2 * elpd
WAIC_SE = sqrt(length(lppds) * var(-2 .* (lppds - pwaics)))
Which gives me
368.9568650726459
19.188152888480644
I did all this to try to understand the logic and the algorithm behind WAIC-based model comparison. I was aiming to recreate whatever ArviZ.waic does, however, my results are only somewhat consistent with ArviZ’s:
ll = let
    plls = pointwise_loglikelihoods(model, chains)
    # Ensure the ordering of the loglikelihoods matches the ordering of `posterior_predictive`
    log_likelihood_y = getindex.(Ref(plls), string.(keys(posterior_predictive)))
    (; Y=cat(log_likelihood_y...; dims=3))
end
idata = ArviZ.from_mcmcchains(chains;
    log_likelihood = ll,
    library = "Turing",
)
waic(idata)
which gives me
WAICResult with estimates
     elpd  elpd_mcse    p  p_mcse 
 -1.8e+02        9.6  3.0    0.43
According to Claude.ai, I am on the right track since WAIC is -2 * elpd.
I wonder if I am indeed on the right track with my “manual” computation of WAIC and WAIC_SE.
I would also like to know why ArviZ’s version returns values that are so crudely rounded off?