Turing - Syntax for Compositional Gibbs

Hi everyone,

I just finished reading the paper for the Julia Turing module and went through your tutorials – looks really great, congratulations! I did not find a specific forum for it, so I post my question here:

In your tutorials, you have a HMM example included here: https://turing.ml/tutorials/4-bayeshmm/

I will abuse notation a bit and not include indices. You have some evidence e, and want the joint posterior of parameter \theta=\{m,T\} and state trajectories s, P(\theta, s | e). You initialize the sampler via the command:

g = Gibbs(1000, HMC(2, 0.001, 7, :m, :T), PG(20, 1, :s) )

Now, my actual question. I really like your concept of compositional Gibbs, and this sampler runs really well, but what you really do there is running a Particle Gibbs algorithm via (1) particle filtering the states s and (2) jointly sampling \theta via HMC + correction step. It cannot be a Gibbs sampler as you use a particle approximation for the trajectory, and as you only use 20 particle trajectories, I don’t think you apply any form of correction here because your chain is never stuck and works perfectly fine for this example.

I kind of get that you want to unify the commands you need to know for inference, but this looks kind of weird, because you call PG inside the Gibbs sampler, but at the end you perform Particle Gibbs, not a Gibbs sampler, right? They are really different beasts. I have seen that you have a SMC.jl file inside your Turing module, wouldn’t it make much more sense to write something like this for state space models:

g = PG( YOUR_SMC_METHOD(:s,…), YOUR_MCMC_METHOD(:theta,…) )
#or more general
g = ParticleMCMC( YOUR_SMC_METHOD(:s,…), YOUR_MCMC_METHOD(:theta,…) )

Apologies if I somehow misinterpreted it, and thank you for your input!

Best regards

Edit: Just to clarify - the sampler worked perfectly fine, I just wonder about the syntax when you combine various MC methods.

1 Like

I’m not actually sure here about the terminology, but I can tell you exactly what’s going on under the hood.

Let’s take the above case:

g = Gibbs(1000, HMC(2, 0.001, 7, :m, :T), PG(20, 1, :s) )

For each Gibbs step, the sampler will do the following:

  1. Ask HMC to update its parameterizations of :m and :T.
  2. Ask PG to update its parameterizations of :s.
  3. After both subsamplers run, bundle up the total updates and store that as a sample.

I think probably one of the more academic membersof the Turing team (@Kai_Xu @yebai or maybe @trappmartin) could speak as to how the language fits in here, but that’s really the gist of what’s happening under the hood.

My understanding is that this is similar to how Metropolis-within-Gibbs functions, but I could be wrong here.

1 Like

My understanding is that this is similar to how Metropolis-within-Gibbs functions, but I could be wrong here.

That’s correct - PG (aka conditional SMC) can be viewed as a MH kernel that always accept (similar to slice sampling). It can get stuck (or sticky) when particle degeneracy is severe. To understand this better, try to use 2 particles for PG, and see what might happen.

1 Like

PS. I agree that the sampler name PG is not very accurate. We can consider renaming it to cSMC (i.e. conditional SMC), to match the terminology in the literature. See

Andrieu C, Doucet A, Holenstein R. Particle Markov chain Monte Carlo methods. J R Stat Soc Series B Stat Methodol. Wiley Online Library; 2010;72: 269–342.

Andrieu C, Lee A, Vihola M. Uniform Ergodicity of the Iterated Conditional SMC and Geometric Ergodicity of Particle Gibbs samplers [Internet]. arXiv [math.PR]. 2013. Available: [1312.6432] Uniform Ergodicity of the Iterated Conditional SMC and Geometric Ergodicity of Particle Gibbs samplers

3 Likes

Thanks to both of you for your explanation!

Maybe it is because the field is still relatively new and not clear cut - but I would understand something different between Particle Gibbs and conditional Sequential Monte Carlo, with conditional SMC just being part of a Particle Gibbs algorithm, and that might be the main source of confusion. I will take a look at the source code to get a better insight.