What is the interpretation of Turing's std, naive_se, mcse?

An MCMC chain summary gives

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat 
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64 

What is the difference between std, naive_se, and mcse in terms of:

  • how they are computed
  • how they should be interpreted
  • how they should be used/reported

Thanks.

The Effective Sample Size (ESS) and Monte Carlo Standard Error (MCSE) are described in Kai Xu’s thesis at page 16:

Effective Sample Size. ESS is a measure of how well a continuous chain is mixing. For a given chain \{x_i\}_{1:n}, ESS is defined by

\text{ESS} = \frac{n}{1 + \sum_{k=1}^\infty \rho_k},

where n is the total number of samples in the chain and \rho_k is the autocorrelation factor at lag k of the chain [11].

ESS measures how many samples in the chain are effective: the larger the value, the higher the sampling efficiency. Different MCMC samplers can be compared by generating the same number of samples with each and comparing the resulting ESS values.
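To make the ESS definition concrete, here is a rough sketch in plain Python (standard library only). It is not what Turing or MCMCChains actually runs. Two things are assumptions on my part: the denominator uses the more common convention 1 + 2 \sum \rho_k (as in Bayesian Data Analysis) rather than the 1 + \sum \rho_k quoted above, and the infinite sum is truncated at the first non-positive autocorrelation, which is only one of several truncation rules in use. The names `autocorr`, `ess`, and `ar1_chain` are made up for illustration.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0.0:
        return 0.0  # constant chain: treat as uncorrelated
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def ess(x, max_lag=None):
    """ESS = n / (1 + 2 * sum_k rho_k).

    Assumption: the sum stops at the first non-positive autocorrelation.
    """
    n = len(x)
    if max_lag is None:
        max_lag = n // 2
    total = 0.0
    for k in range(1, max_lag + 1):
        rho = autocorr(x, k)
        if rho <= 0.0:
            break
        total += rho
    return n / (1.0 + 2.0 * total)


def ar1_chain(n, phi, seed=0):
    """Toy chain x_t = phi * x_{t-1} + noise; phi = 0 gives independent draws."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


if __name__ == "__main__":
    print("ESS, independent draws:", ess(ar1_chain(2000, 0.0)))
    print("ESS, sticky chain     :", ess(ar1_chain(2000, 0.9)))
```

The strongly autocorrelated chain should report a much smaller ESS than the independent one, even though both contain 2000 draws.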

Monte Carlo Standard Error. MCSE is an estimate of the inaccuracy of MC samples. There are multiple ways to estimate MCSE, among which the batch means method proposed in [12] is believed to be the most popular.

[…]

As MCSE measures the inaccuracy of MC samples, a smaller value of MCSE is an indicator of better sampling performance.
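For anyone wondering what a batch means estimate looks like in practice, here is a minimal sketch in plain Python. It follows the textbook recipe (split the chain into consecutive batches, then use the spread of the batch averages to estimate the error of the overall mean). It is not necessarily the exact formula from the thesis, which is elided above, nor what Turing computes. The function name `mcse_batch_means` and the default of about sqrt(n) batches are my own assumptions.

```python
import math
import random


def mcse_batch_means(x, n_batches=None):
    """Batch means estimate of the Monte Carlo standard error of mean(x).

    Assumption: the default number of batches is floor(sqrt(n)).
    """
    n = len(x)
    if n_batches is None:
        n_batches = int(math.sqrt(n))
    if n_batches < 2:
        raise ValueError("need at least two batches")
    batch_size = n // n_batches
    # Drop the leftover tail so that every batch has the same length.
    means = [
        sum(x[b * batch_size:(b + 1) * batch_size]) / batch_size
        for b in range(n_batches)
    ]
    grand_mean = sum(means) / n_batches
    var_of_means = sum((m - grand_mean) ** 2 for m in means) / (n_batches - 1)
    # Standard error of the average of n_batches roughly independent batch means.
    return math.sqrt(var_of_means / n_batches)


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [rng.gauss(0.0, 1.0) for _ in range(2500)]
    print("batch means MCSE:", mcse_batch_means(draws))
```

For independent draws this should land near std / sqrt(n); for an autocorrelated chain it comes out larger, which is the whole point of using it instead of the naive formula.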

However, it has been argued that MCSE is generally unimportant when the goal of inference is the parameters themselves rather than the expectation of the parameters, in which case the ESS would be a more important measure [13].

References

[11] Orlaith Burke. Statistical Methods Autocorrelation: MCMC Output Analysis. Department of Statistics, University of Oxford, 2012.

[12] James M. Flegal, Murali Haran, and Galin L. Jones. Markov chain Monte Carlo: Can we trust the third significant figure? Statistical Science, pages 250–260, 2008.

[13] Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. Bayesian Data Analysis. Texts in Statistical Science Series, 2004.

The naive Standard Error (naive_se), according to Rufo on StackExchange, is

a measure of the computational MCMC error for the estimation of the posterior expected value of a parameter.

[…]

If we dig a little into the R function summary.mcmc.list from the package coda (which calls safespec0, which in turn calls spectrum0.ar), we find that the definition of the naive SE is:

SE_{Naive} = \sqrt{\frac{Var(X)}{C \cdot S}}

with C being the number of chains run, X = \{X^{(c)}\} being the vector of posterior samples of a given parameter (the concatenation of all the chains, c \in \{1, \dots, C\}), and S being the length (the number of iterations) of each chain.
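As a rough illustration of how the three columns relate, here is a plain Python sketch. std is just the standard deviation of the pooled draws. naive_se follows the coda formula quoted above, which treats all C·S draws as independent. For mcse I use the common relation std / sqrt(ESS), which is an assumption on my part and may not be the estimator Turing uses. The function names are hypothetical.

```python
import math
import statistics


def posterior_std(chains):
    """Sample standard deviation of all draws pooled across chains (the std column)."""
    pooled = [v for chain in chains for v in chain]
    return statistics.stdev(pooled)


def naive_se(chains):
    """sqrt(Var(X) / (C * S)): standard error that ignores autocorrelation."""
    pooled = [v for chain in chains for v in chain]
    return statistics.stdev(pooled) / math.sqrt(len(pooled))


def mcse_from_ess(std, ess):
    """Assumed relation: replace the raw draw count C * S by the effective sample size."""
    return std / math.sqrt(ess)


if __name__ == "__main__":
    chains = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
    sd = posterior_std(chains)
    print("std     :", sd)
    print("naive_se:", naive_se(chains))
    # Pretend only half of the 8 draws are effectively independent.
    print("mcse    :", mcse_from_ess(sd, ess=4.0))
```

Read this way, std describes the posterior itself and is what you would report as posterior uncertainty, while naive_se and mcse describe how precisely the posterior mean has been estimated. mcse is at least as large as naive_se whenever ESS is no bigger than the number of draws.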

rhat seems to be the Gelman-Rubin \hat{R}, but I’m not sure exactly which version Turing implements, since the equation has changed a few times.

Finally, Cameron Pfiffer said you shouldn’t rely on ess but on rhat instead, since rhat is more reliable.

McElreath (2020) summarizes ess and rhat as ess being “a crude estimate of the number of independent samples you managed to get. Rhat (\hat{R}) is an indicator of the convergence of the Markov chains to the target distribution. It should approach 1.00 from above, when all is well.”
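For reference, a minimal sketch of the classic Gelman-Rubin \hat{R} in plain Python. This is the original between-chain versus within-chain variance version for equal-length chains, not the split or rank-normalized variants, and I am not claiming it matches what Turing implements. The function name is hypothetical.

```python
import math
import random
import statistics


def gelman_rubin_rhat(chains):
    """Classic R-hat for a list of equal-length chains of one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    chain_means = [statistics.mean(c) for c in chains]
    chain_vars = [statistics.variance(c) for c in chains]
    within = statistics.mean(chain_vars)            # W: average within-chain variance
    between = n * statistics.variance(chain_means)  # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n   # pooled variance estimate
    return math.sqrt(var_plus / within)


if __name__ == "__main__":
    rng = random.Random(42)
    agree = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
    disagree = [[rng.gauss(3.0 * c, 1.0) for _ in range(1000)] for c in range(4)]
    print("chains agree   :", gelman_rubin_rhat(agree))
    print("chains disagree:", gelman_rubin_rhat(disagree))
```

Chains sampling the same distribution should give a value very close to 1, while chains stuck in different regions give something clearly above 1.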
