Query CDF and quantiles for MCMC Chains

I have a Chains object chain_normal:

julia> @show chain_normal
chain_normal = MCMC chain (1000×14×4 Array{Float64, 3})
Chains MCMC chain (1000×14×4 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 4
Samples per chain = 1000
Wall duration     = 15.27 seconds
Compute duration  = 60.7 seconds
parameters        = μ, σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_s ⋯
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float ⋯

           μ   14.8588    0.2065     0.0033    0.0038   3281.5093    0.9998       54.05 ⋯
           σ    0.9170    0.1631     0.0026    0.0027   3109.1461    1.0002       51.21 ⋯
                                                                         1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64

           μ   14.4474   14.7244   14.8590   14.9937   15.2685
           σ    0.6617    0.8024    0.8948    1.0073    1.3041

I want to ask questions like “what is the probability that the mean is over 15”. To do so. I would calculate something like the following pseudocode:
1-cdf(chain_normal[:μ], 15)

That does of course not work. But how am I supposed to ask such questions about objects of type Chain, in a bayesean inference context?

Something functionally equivalent to what I want is as follows:

julia> let (chain, parameter, condition) = (chain_normal, :μ, >(15))
           counter = 0
           for sample in chain[parameter].data
               if condition(sample); counter += 1; end
           end
           return counter / length(chain_normal[:μ].data)
       end
0.24025

But that seems extremely wacky to have to code up, in order to answer what I imagine must be a quite common question about the results.

Regarding quantiles, it is quite easy to do what I want:

julia> quantile(vec(chain_normal[:μ].data), 0.5)
14.858987046238722

But it still feels more hacky than I would like, and I have an expectation that there should be a way to do this that feels more intended. Manually accessing the datafield and reshaping feel like operations that should be automatically encoded by an intended way to query quantiles from a Chain.

If you are going to ask questions like this a bunch you could create and ecdf object

myecdf = ecdf(chain_normal[:\mu])
1-myecdf(15)
1-myecdf(35)
...

If it’s a one off query, then

mean(chain_normal[:\mu] .> 15)

(I’m on my phone and don’t know how to get mu character…)

Nice! I did not know that I do not have to access the data field of the AxisArrays.

So I currently have the following methods:

# CDF or 1-CDF
julia> mean(chain_normal[:μ] .> 15)
0.24025

# quantiles
julia> quantile(chain_normal[:μ]|>vec, 0.5)
14.858987046238722

which are not bad at all in terms of number of characters to type out. Should examples like this be documented?

Some digging around the source lead me to

julia> quantile(chain_normal, q=[0.9])
Quantiles
  parameters     90.0% 
      Symbol   Float64

           μ   15.1039
           σ    1.1268
2 Likes