SMC with HMC Kernel

Is it possible to use an HMC or NUTS kernel with Turing’s SMC methods, e.g. as described here and implemented in TensorFlow Probability (which allows for using arbitrary kernels)?

There is no easy-to-use implementation of Buchholz et al.'s SMC as far as I know, so you’ll have to write it from scratch. I did implement it myself in a personal project, so I can help if you’re going to give it a shot. Is there a reason you want to use SMC for your specific problem? Vanilla NUTS as implemented in AdvancedHMC.jl should work just as well most of the time.

I actually don’t have an especially strong need for it. My thought was that I could offer it as a sensible option for refitting when a PSIS-LOO approximation is unreliable, i.e. whenever the Pareto k diagnostic exceeds 0.7.

Actually, given you’ve already written the code, the HMC+SMC implementation looks like a great thing to add to Turing — have you considered making a pull request to add it? Or, if that would be too much work, making a new package for it? Huge bonus points if you can make it work with NUTS or ChEES-HMC so that fine-tuning isn’t necessary.

For the record, I think SMC methods in general, including Buchholz et al.'s, are not yet as sophisticated as what the MCMC camp has. For example, in Buchholz et al.'s paper, the method for determining k, the number of kernel applications, is quite arbitrary. I personally felt that it is overly conservative, requiring many more kernel applications than one would expect.

Also, there are the following critical limitations that need to be addressed.

  1. The use of tempered posteriors means that we have to perform adaptation in every iteration under a very tight budget. In MCMC the target posterior is fixed, so adaptation is much easier.

  2. As the Stan camp has been strongly arguing, HMC methods are useful for detecting and diagnosing broken or pathological models. Tempered SMC does not offer this, since we slowly transition from the prior to the full posterior.
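To make the first point concrete, here is a rough, language-agnostic sketch of a tempered SMC loop in plain Python. It is not Buchholz et al.'s algorithm and not anything from Turing: the toy model (a N(0, 5²) prior with a single unit-variance Gaussian observation at y = 3), the function names, the ESS-based bisection for the next temperature, and the fixed number of moves per temperature are all my own assumptions. To keep it short, the move step is a random-walk Metropolis kernel standing in for HMC. The thing to notice is that the step size has to be re-tuned at every temperature because the target keeps changing.

```python
import math
import random


def log_prior(theta):
    # Assumed toy prior: N(0, 5^2), up to an additive constant.
    return -0.5 * (theta / 5.0) ** 2


def log_lik(theta, y=3.0):
    # Assumed toy likelihood: one observation y ~ N(theta, 1).
    return -0.5 * (y - theta) ** 2


def ess(log_w):
    # Effective sample size of unnormalised log-weights.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    return sum(w) ** 2 / sum(x * x for x in w)


def next_temperature(loglik, beta, target_ess):
    # Jump straight to 1.0 if the ESS allows it, otherwise bisect
    # for the temperature at which the ESS hits the target.
    if ess([(1.0 - beta) * l for l in loglik]) >= target_ess:
        return 1.0
    lo, hi = beta, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if ess([(mid - beta) * l for l in loglik]) < target_ess:
            hi = mid
        else:
            lo = mid
    return hi


def tempered_smc(n=2000, target_frac=0.5, n_moves=5, seed=0):
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 5.0) for _ in range(n)]  # draws from the prior
    beta, schedule = 0.0, [0.0]
    while beta < 1.0:
        loglik = [log_lik(t) for t in particles]
        new_beta = next_temperature(loglik, beta, target_frac * n)
        # Reweight by the likelihood raised to the temperature increment.
        log_w = [(new_beta - beta) * l for l in loglik]
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        particles = rng.choices(particles, weights=w, k=n)  # multinomial resampling
        beta = new_beta
        schedule.append(beta)
        # Re-adapt the kernel for the new target: scale the proposal
        # by the current particle spread.
        mean = sum(particles) / n
        sd = math.sqrt(sum((t - mean) ** 2 for t in particles) / n)
        step = 2.38 * sd
        for _ in range(n_moves):
            for i, t in enumerate(particles):
                prop = t + step * rng.gauss(0.0, 1.0)
                log_acc = (log_prior(prop) + beta * log_lik(prop)) - (
                    log_prior(t) + beta * log_lik(t)
                )
                if rng.random() < math.exp(min(0.0, log_acc)):
                    particles[i] = prop
    return particles, schedule


if __name__ == "__main__":
    particles, schedule = tempered_smc()
    print("temperatures:", schedule)
    print("posterior mean estimate:", sum(particles) / len(particles))
```

For this conjugate toy model the exact posterior mean is 3 / 1.04 ≈ 2.885, so the estimate is easy to sanity-check. With an HMC kernel, the one-line step-size rule would become a step size, a trajectory length, and a mass matrix, all tuned from the same small particle budget at every temperature.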

In the long run, I do think SMC methods will be much more useful. However, getting there will not be easy unless we do something about the need for tempered posteriors. In theory there’s no problem with them, but in practice they are a liability. At the moment, I don’t think SMC is worth considering as an alternative to MCMC.
