Bayesian model comparison / selection

Hello everyone,

My question revolves around Bayesian model comparison and, by extension, model selection.
I’ve built 3 alternative models (each sampled with NUTS(0.65), 4 chains, 8000 draws) and fit my data with the Turing.jl library.
However, I could not find a way in Turing to compute LOO diagnostics. Based on some discussions I found online, I implemented the following for each model using the ArviZ interface:

ℓ = Turing.pointwise_loglikelihoods(param_mod, chains)
ℓ_mat = reduce(hcat, values(ℓ));
ℓ_arr = reshape(ℓ_mat, 1, size(ℓ_mat)...); # (chain_idx, sample_idx, parameter_idx)

data = ArviZ.from_mcmcchains(
    chains,
    library = "Turing",
    log_likelihood = Dict("y" => ℓ_arr)
)
criterion = ArviZ.loo(data)

Finally, I built a Dict and fed it into the compare function:

compare_dict = Dict(
        "model 1" => criterion1,
        "model 2" => criterion2,
        "model 3" => criterion3,
)
compare(compare_dict; ic="loo")

I also tried feeding in the ArviZ InferenceData objects directly, which doesn’t work either. I always get a non-informative error message:

LoadError: PyError ($(Expr(:escape, :(ccall(#= C:\Users\xx\.julia\packages\PyCall\7a7w0\src\pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'AttributeError'>
AttributeError('Encountered error in ic computation of compare.')

In the documentation (Reference · ArviZ.jl) there is only a link "See documentation for arviz.compare", which gives just as little information about the function as the previous page.

From previous discussions here in the forum I know about the StatisticalRethinkingJulia / ParetoSmoothedImportanceSampling.jl package, but I couldn’t find a compare function in it.
From the discussion here, Bayesian Model Selection - #8 by ANasc, I gathered that the ModelComparisons package was in the end not added to the MCMCChains repo…?

Any recommendations on how to approach this problem? How can I find out how the compare function works, i.e. which arguments it takes and in which format? Is there any other package that includes model comparison?

I would be very happy about any recommendations!

I’m afraid that I don’t have an answer to your immediate question. However, another approach is to use ParetoSmooth.jl. This might be relevant: Home · ParetoSmooth.jl
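
To give an idea of what that could look like with Turing models, here is a minimal sketch. I’m assuming ParetoSmooth.jl’s psis_loo can be called directly on a Turing model together with its chains, and that loo_compare accepts the resulting objects (the model_names keyword in particular is written from memory), so please check the ParetoSmooth.jl docs for the exact signatures:

using Turing, ParetoSmooth

# Toy data and model standing in for the three alternative models in the question.
y = randn(50) .+ 1.0

@model function toy_model(y)
    μ ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1), 0, Inf)
    for i in eachindex(y)
        y[i] ~ Normal(μ, σ)  # one likelihood term per observation, as LOO needs
    end
end

chains = sample(toy_model(y), NUTS(0.65), MCMCThreads(), 1000, 4)

# PSIS-LOO computed directly from the Turing model and its chains.
loo1 = psis_loo(toy_model(y), chains)

# With one result per model, loo_compare ranks them by estimated ELPD difference:
# loo_compare(loo1, loo2, loo3; model_names=["model 1", "model 2", "model 3"])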


Ah, thanks, this is a bug. I have opened an issue to track this. ArviZ converts the arviz.ELPDData returned by arviz.loo to a DataFrame on the Julia side, but the Python compare function expects an ELPDData input, and currently ArviZ doesn’t convert it back.

There are two quick workarounds.

First, we’ll load an example InferenceData:

julia> using ArviZ

julia> idata = load_arviz_data("radon");

Option 1: pass a Dict of InferenceData to compare:

julia> compare_dict1 = Dict(
           "model 1" => idata,
           "model 2" => idata,
           "model 3" => idata,
       );

julia> compare(compare_dict1; ic="loo")
3×10 DataFrame
 Row │ name     rank   loo       p_loo    d_loo    weight    se       dse      warning  loo_scale 
     │ String   Int64  Float64   Float64  Float64  Float64   Float64  Float64  Bool     String    
─────┼────────────────────────────────────────────────────────────────────────────────────────────
   1 │ model 1      0  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   2 │ model 3      1  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   3 │ model 2      2  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log

Option 2: bypass ArviZ.jl’s conversion to DataFrame by calling the Python function directly:

julia> loo_result = ArviZ.arviz.loo(idata)
PyObject Computed from 2000 by 919 log-likelihood matrix

         Estimate       SE
elpd_loo -1027.12    28.85
p_loo       26.76        -
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      919  100.0%
 (0.5, 0.7]   (ok)          0    0.0%
   (0.7, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%


julia> compare_dict2 = Dict(
           "model 1" => loo_result,
           "model 2" => loo_result,
           "model 3" => loo_result,
       );

julia> compare(compare_dict2; ic="loo")
3×10 DataFrame
 Row │ name     rank   loo       p_loo    d_loo    weight    se       dse      warning  loo_scale 
     │ String   Int64  Float64   Float64  Float64  Float64   Float64  Float64  Bool     String    
─────┼────────────────────────────────────────────────────────────────────────────────────────────
   1 │ model 1      0  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   2 │ model 3      1  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log
   3 │ model 2      2  -1027.12  26.7643      0.0  0.333333  28.8476      0.0    false  log

Whenever ArviZ.jl wraps a Python function without modification, rather than manually converting all of the docs to Julia and trying to keep them up to date, we refer users to the Python docs. In the future these will all be pure Julia functions with their own documentation.


Thank you so much @sethaxen for the fast reply! I tried the second workaround and got my result. Thanks :hugs:
