Bayesian model comparison / selection

Hello everyone,

My question revolves around Bayesian model comparison and, by extension, model selection.
I’ve built 3 alternative models (each sampled with NUTS(0.65), 4 chains, 8000 draws) and fit my data with the Turing.jl library.
However, I could not find a way in Turing to compute LOO diagnostics. Based on some discussions I found online, I implemented the following for each model using the ArviZ interface:

ℓ = Turing.pointwise_loglikelihoods(param_mod, chains)
ℓ_mat = reduce(hcat, values(ℓ));
ℓ_arr = reshape(ℓ_mat, 1, size(ℓ_mat)...); # (chain_idx, sample_idx, parameter_idx)

data = ArviZ.from_mcmcchains(
    chains,
    library = "Turing",
    log_likelihood = Dict("y" => ℓ_arr)
)
criterion = ArviZ.loo(data)

Finally, I built a Dict and fed it into the compare function:

compare_dict = Dict(
        "model 1" => criterion1,
        "model 2" => criterion2,
        "model 3" => criterion3,
)
compare(compare_dict; ic="loo")

I also tried feeding in the ArviZ InferenceData objects directly, which doesn’t work either. I always get a non-informative error message:

LoadError: PyError ($(Expr(:escape, :(ccall(#= C:\Users\xx\.julia\packages\PyCall\7a7w0\src\pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'AttributeError'>
AttributeError('Encountered error in ic computation of compare.')

In the documentation (Reference · ArviZ.jl) there is only a link "See documentation for arviz.compare", which gives just as little information about the function as the previous page.

From previous discussions here in the forum I know about the StatisticalRethinkingJulia / ParetoSmoothedImportanceSampling.jl package, but I couldn’t find a compare function in it.
From the discussion here, Bayesian Model Selection - #8 by ANasc, I gathered that the ModelComparisons package was in the end not added to the MCMCChains repo…?

Any recommendations on how to approach this problem? How can I find out how the compare function works, i.e. which arguments it takes and in which format? Is there any other package that includes model comparison?

I would be very happy about any recommendations!

I’m afraid that I don’t have an answer to your immediate question. However, another approach is to use ParetoSmooth.jl. This might be relevant: Home · ParetoSmooth.jl
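
To give an idea of what that could look like with Turing models, here is a minimal sketch. I’m assuming ParetoSmooth.jl’s psis_loo can be called directly on a Turing model together with its chains, and that loo_compare accepts the resulting objects (the model_names keyword in particular is written from memory), so please check the ParetoSmooth.jl docs for the exact signatures:

using Turing, ParetoSmooth

# Toy data and model standing in for the three alternative models in the question.
y = randn(50) .+ 1.0

@model function toy_model(y)
    μ ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1), 0, Inf)
    for i in eachindex(y)
        y[i] ~ Normal(μ, σ)  # one likelihood term per observation, as LOO needs
    end
end

chains = sample(toy_model(y), NUTS(0.65), MCMCThreads(), 1000, 4)

# PSIS-LOO computed directly from the Turing model and its chains.
loo1 = psis_loo(toy_model(y), chains)

# With one result per model, loo_compare ranks them by estimated ELPD difference:
# loo_compare(loo1, loo2, loo3; model_names=["model 1", "model 2", "model 3"])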


Ah, thanks, this is a bug. I have opened an issue to track this. ArviZ converts the arviz.ELPDData returned by arviz.loo to a DataFrame on the Julia side, but the Python compare function expects an ELPDData input, and currently ArviZ doesn’t convert it back.

There are two quick workarounds.

First, we’ll load an example InferenceData:

julia> using ArviZ

julia> idata = load_arviz_data("radon");

Option 1: pass a Dict of InferenceData to compare:

julia> compare_dict1 = Dict(
           "model 1" => idata,
           "model 2" => idata,
           "model 3" => idata,
       );

julia> compare(compare_dict1; ic="loo")
3×10 DataFrame
 Row │ name     rank   loo       p_loo    d_loo    weight    se       dse      warning  loo_scale 
     │ String   Int64  Float64   Float64  Float64  Float64   Float64  Float64  Bool     String    
─────┼────────────────────────────────────────────────────────────────────────────────────────────
   1 │ model 1      0  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   2 │ model 3      1  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   3 │ model 2      2  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log

Option 2: bypass ArviZ.jl’s conversion to DataFrame by calling the Python function directly:

julia> loo_result = ArviZ.arviz.loo(idata)
PyObject Computed from 2000 by 919 log-likelihood matrix

         Estimate       SE
elpd_loo -1027.12    28.85
p_loo       26.76        -
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      919  100.0%
 (0.5, 0.7]   (ok)          0    0.0%
   (0.7, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%


julia> compare_dict2 = Dict(
           "model 1" => loo_result,
           "model 2" => loo_result,
           "model 3" => loo_result,
       );

julia> compare(compare_dict2; ic="loo")
3×10 DataFrame
 Row │ name     rank   loo       p_loo    d_loo    weight    se       dse      warning  loo_scale 
     │ String   Int64  Float64   Float64  Float64  Float64   Float64  Float64  Bool     String    
─────┼────────────────────────────────────────────────────────────────────────────────────────────
   1 │ model 1      0  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   2 │ model 3      1  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   3 │ model 2      2  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log

Whenever ArviZ.jl wraps a Python function without modification, rather than manually converting all of the docs to Julia and trying to keep them up to date, we refer users to the Python docs. In the future these will all be pure Julia functions with their own documentation.


Thank you so much @sethaxen for the fast reply! I tried the second workaround and got my result. Thanks :hugs:
