Probabilistic programming repositories

I used Stan.jl, then briefly Mamba.jl, and found both nice as PPLs.

Then I ran into a difficult model, for which, among other things, I

  1. had to use some non-standard transformations (and adjust the log likelihood with the Jacobian determinant accordingly),
  2. could save a lot of time using sufficient statistics,
  3. could improve ESS a lot by conditioning the transformations/model structure on the data (think, eg, centered/non-centered parametrizations in a hierarchical model depending on group size, different for each group).
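
To make point 1 concrete, here is a minimal sketch in plain Julia. It is not the model I was working on: the Exponential(1) prior, the log transformation, and all function names are assumptions chosen purely for illustration. It shows the Jacobian adjustment as a separate, small function, plus a crude normalization check.

```julia
# Illustrative only: a positive scale sigma with an Exponential(1) prior,
# sampled on the unconstrained scale theta = log(sigma).

# Log density of the prior on the original scale (sigma > 0).
logprior_sigma(sigma) = -sigma

# sigma = exp(theta), so d sigma / d theta = exp(theta) and log|J| = theta.
transform(theta) = exp(theta)
logjac(theta) = theta

# Log density on the unconstrained scale: original log density plus log|J|.
logdensity_theta(theta) = logprior_sigma(transform(theta)) + logjac(theta)

# Sanity check: the transformed density should still integrate to 1.
# A plain Riemann sum is enough here because the integrand vanishes in both tails.
function riemann_mass(logdensity; lo = -30.0, hi = 5.0, h = 1e-3)
    grid = lo:h:hi
    sum(t -> exp(logdensity(t)), grid) * h
end

riemann_mass(logdensity_theta)  # should be close to 1
```

Dropping the `logjac` term makes the same check fail badly, which is the kind of mistake that is easy to catch when the pieces are ordinary functions.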

For 1 and 3, things got so complicated that I preferred to unit test the pieces, which I found difficult in a PPL. Also, benchmarking and profiling the model in small pieces helped a lot in finding bottlenecks (think of likelihood evaluations that cost 1–2 s with a gradient even when coded manually; this adds up quickly).
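
As a rough sketch of what unit testing the pieces can look like, combined with the sufficient statistics idea from point 2: the iid normal model, the `NormalSuffStats` type, and the function names below are hypothetical stand-ins, not my actual code. Only Base and the `Test` standard library are used.

```julia
using Test

# Naive log likelihood of iid Normal(mu, sigma) observations: O(n) per evaluation.
function loglik_naive(mu, sigma, x)
    sum(xi -> -log(sigma) - log(2 * pi) / 2 - (xi - mu)^2 / (2 * sigma^2), x)
end

# Sufficient statistics, computed once from the data.
struct NormalSuffStats
    n::Int
    sum_x::Float64
    sum_x2::Float64
end

NormalSuffStats(x::AbstractVector{<:Real}) =
    NormalSuffStats(length(x), sum(x), sum(abs2, x))

# Same log likelihood, O(1) per evaluation:
# sum((x_i - mu)^2) = sum(x_i^2) - 2 * mu * sum(x_i) + n * mu^2.
function loglik_suffstats(mu, sigma, s::NormalSuffStats)
    ss = s.sum_x2 - 2 * mu * s.sum_x + s.n * mu^2
    -s.n * (log(sigma) + log(2 * pi) / 2) - ss / (2 * sigma^2)
end

# The fast version is tested against the obviously correct slow one.
@testset "sufficient statistics match naive log likelihood" begin
    x = [0.3, -1.2, 2.5, 0.9, 1.1]
    s = NormalSuffStats(x)
    for (mu, sigma) in [(0.0, 1.0), (0.5, 1.3), (-2.0, 0.4)]
        @test loglik_suffstats(mu, sigma, s) ≈ loglik_naive(mu, sigma, x)
    end
end
```

Because both versions are plain functions, they can also be benchmarked and profiled separately with the usual Julia tools.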

This experience made me very skeptical of PPLs. Coding up the model as essentially a loglikelihood calculation is not the major pain for me when doing Bayesian inference, and PPLs make it a bit simpler, at the cost of making a lot of other things complicated.

For me, the quality of the backend doing the inference matters much more for nontrivial (≥ 5000 parameters) models than the surface syntax coding the model, which is essentially a fancy way of writing the likelihood. I spend much more time figuring out why I am getting bad mixing or slow sampling than thinking about likelihoods per se.

The ideal interface I am striving for these days is an API composed of Julia code that facilitates coding models. When I run into problems, this allows me to deal with them the same way as in other Julia programs, using the extensive set of tools kind people have made available (eg ProfileView.jl and Traceur.jl, when I need to go beyond @code_warntype).

This is very subjective, but I consider PPLs implemented as DSLs a way of bringing back the two-language problem. I think this had a rationale in languages less powerful than Julia, eg Stan is great because R can’t cut it in speed and no one wants to program in C++ if they can avoid it. But for me, Julia’s powerful low-cost abstractions motivate exploring how to do without PPLs.
