Turing's summary statistics - ESS

Hi,

I’m new to MCMC, and I’m using the Turing package to implement NUTS.

par_NUTS = sample(model, NUTS(1000, 0.65), MCMCThreads(), 1000, 3)

I’m trying to figure out what the Effective Sample Size (ESS) values returned in the summary statistics mean.

I’ve run 3 separate chains. When I check the ESS values for the individual chains and for the combined chains, they turn out to be identical:

summarystats(par_NUTS; append_chains=true) # Appended chains

[image: summary statistics table]

summarystats(par_NUTS; append_chains=false)[1] # Individual chains

[image: summary statistics table]

summarystats(par_NUTS; append_chains=false)[2] # Individual chains

[image: summary statistics table]

summarystats(par_NUTS; append_chains=false)[3] # Individual chains

[image: summary statistics table]

What does this mean? Is it the relative ESS values that are returned?


The documentation of MCMCChains.rhat references the paper https://arxiv.org/pdf/1903.08008.pdf, from which it apparently takes the calculation of rhat (and ess?).
I have only skimmed the first two pages just now, so I hope I understand it correctly. The authors make the point that, in general, the convergence of MCMC cannot be reliably assessed from a single chain.

Thus, these diagnostics are computed across all chains rather than for every chain separately, which explains the identical values you observe. You can force a per-chain calculation by calling ess_rhat(chains[:,:,1]) etc., but again, the single-chain values might not reliably report failed convergence and might overestimate the ESS.
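To illustrate the per-chain slicing, here is a minimal sketch on synthetic draws rather than real sampler output. The names chn and per_chain are made up for the example, the random data only stands in for something like par_NUTS, and it uses summarystats instead of ess_rhat since the exact diagnostic functions and column names differ between MCMCChains versions.

```julia
using MCMCChains
using Random

Random.seed!(1)

# Synthetic stand-in for sampler output (an assumption, not real NUTS draws):
# 1000 iterations, 2 parameters, 3 chains.
chn = Chains(randn(1000, 2, 3), [:a, :b])

# Diagnostics computed jointly over all three chains.
display(summarystats(chn))

# Diagnostics computed per chain: slice out one chain first, then summarize.
per_chain = [summarystats(chn[:, :, k]) for k in 1:size(chn, 3)]
foreach(display, per_chain)
```

With only one chain in each slice, the diagnostics cannot compare chains against each other, so treat the per-chain numbers as optimistic.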

(P.S.: Please do not double post (see 'ESS' in Turing.jl) if a question doesn’t receive any attention for a couple of days. That happens. Instead, you can bump it with a comment after a while.)

I think this is worth a bug report in MCMCChains. While an argument can be made that the R-hat computed for all chains should be returned for each chain, the same cannot be said for the ESS. It’s simply incorrect to report, for a single chain, the ESS or MCSE of all chains together. It’s probably best that MCMCChains instead computes the ESS, R-hat, and MCSE for each chain separately. While this decreases the usefulness of the diagnostics, it avoids potential footguns.