Current state of Julia Probabilistic Programming Languages and functionalities

Hey folks. I was looking over the various Julia probabilistic programming languages and was trying to reconcile the features of each. Unfortunately I did not get very far. But I was wondering if anyone can point me to a resource or something that explains the features and applications for the different Julia PPLs. I saw that @cscherrer had proposed a survey paper, but I am not sure if that came to fruition.

I was wondering if there really is some sort of benchmarking between these different PPLs just to help understand which algorithms perform best under different circumstances.

Any help would be appreciated. Here is a list of PPLs that I know of, but there are more. This is the list from Chad Scherrer with some addtitions.

8 Likes

Also interested in the answer! For the sake of completeness, I think AbstractPPL also belongs here

2 Likes

Hi @00krishna , I’m happy to give an overview of Soss and Tilde.

Soss

The idea with Soss was to start with more tractable problems, and generalize as we understand the space better. So it has some restrictions

  • A model must be a collection of top-level “statements”, each of which is an assignment (=) or a sample (~).
  • No mutation is allowed at the model level
  • The left side of each = or ~ must be a symbol. E.g. x[3] ~ Normal() is not allowed

Unlike the left side, the right side of = can be an arbitrary Julia expression, and the right side of ~ can be an arbitrary measure. One nice thing about this is, a Soss model with specified arguments is a measure, so you can nest these however you like.

As I understand, Turing has a strong preference to flatten the namespace. Soss doesn’t do this; nested models result in a nested namespace, manipulated using NestedTuples.jl.

A fundamental idea of Soss and Tilde is that models are represented syntactically. We keep the model close to what the user writes, so it can be mapped into different inference functions very flexibly. Soss knows about dependencies of the model statements, so you can very easy manipulate the underlying DAG.

Soss is designed to be very hackable. To build an inference primitive, you specify what = and ~ map to, and then what code goes around the result. It’s all based on code generation, and makes extensive use of GeneralizedGenerated.jl in particular, among other very nice @thautwarm packages. Taine Zhao also made significant contributions to the core design.

MeasureTheory

Soss originally used Distributions.jl. As I got further with this, I came to realize that PPL design really isn’t the target use for Distributions. There aren’t many ways to build new distributions from components, and there’s often a significant cost to construction. This is fine for many use cases, but Bayesian inference usually has a construction step in the middle of a hot loop. I don’t use Distributions much, so this may have changed.

This led me to start work on MeasureTheory.jl. Despite the name (Measures.jl was unfortunately taken) it’s not really about theory. What it’s really about it composability and computational efficiency. One big part of this has been to keep Soss relatively simple, and push complexity into the measures you draw from. So for example, in Soss you can sample a random walk as

walk ~ Chain(Normal()) do x
    Normal(x, 1)
end

Chain(...) returns a Markov chain, which is a measure over sequences. If you call rand on this, you get a reproducible infinite sequence with an iterator interface. Or you can come up with a sequence and evaluate its log-density for a given chain. The transition kernel for a Chain can even be a Soss model.

MeasureTheory has benefitted from lots of discussions with @mschauer . For more on MeasureTheory.jl, check out our paper.

Tilde

For quite a while, I’ve been thinking of extending Soss to allow for more flexibility. Orginally this was going to be a new ASTModel construction, with the current Soss.Model changed to DAGModel. But the further I got into it, the more it started to feel like a separate thing. This led me to split it into its own package.

The dream goal of Tilde is to be a universal front end for PPL. The body of a model can be arbitrary Julia code. This is stored as an AST (a Julia Expr) so you can map it into anything. And for most use cases, things can be much simpler, for example defining rand on Tilde models is just a few lines of code.

You can find some details on the design of Tilde in this blog post.

8 Likes

Hi @00krishna . There is a little bit of overview about ReactiveMP (which uses GraphPPL for model specification).

In our cases we are mostly interested in creating the factor graph representation of a probabilistic model. The main purpose then is to run message passing-based Bayesian inference, we do not use sampling.

not DAG

Unlike other PPL (as far as I know) we are not concerned neither with loops the factor graph nor with direction. Our models are not DAGs. For example in ReactiveMP you can do something like:

@model function my_model()
     x ~ Normal(0.0, 1.0)
     x ~ Normal(0.0, 1.0) # This is a perfectly valid factor graph and has a proper interpretation
     ...
end

Message passing-based inference can run inference in this case and this form has a proper interpretation. ~ expression does not really mean sampled from (but tries to look like it), but simply creates a functional relationship between random variables in the form of an edge in a factor graph. We do not use this feature much though.

On the other hand, it applies some extra limitations to the model specification, e.g. we cannot use random variables within square brackets expressions as we treat random variables as … well… random variables and not numbers, as how for example Turing does. For example this is not possible:

@model function my_model()
    x ~ Normal(0.0, 1.0)
    y ~ Normal(0.0, 1.0)
    # `x` and `y` cannot be used within indexing expression
    # this expression has no proper interpretation in a form of a factor graph
    z ~ MvNormal([ x, y ], diageye(2)) 
end

datavars

We also use a different way of passing observation in the model:

@model function my_model()
    y = datavar(Float64, n) # `y` accepts an array of length `n` with type `Float64`
end

The reason is that we treat data variables as reactive inputs, model accepts an infinite data stream and continously updates posterior marginals of random variables in the model as soon as data arrives (and if it has time to do that). The whole API is designed about reactivity. Posterior marginals are exposed as streams, Bethe Free Energy computation is a stream of floats, etc. So in our framework it is possible, for example, to subscribe on posterior marginal and redirect it into a stream of prior for the next time step and perform online filtering procedure with potentially infinite data stream.

reactivity

Most of the other PPL packages are concerned with just evaluating p(x|y) given some static dataset y. ReactiveMP API (and PPL) is designed for different purposes. We treat y as an infinite data stream in a form of the observable (from Rocket.jl). We may update our data with new observations and, as a result, posterior marginals will also react and update themselves.

for loops

We are mostly interested in state-space models, so in ReactiveMP it is very common to see for loops:

@model function my_model(n)
    x = randomvar(n)
    y = datavar(Float64, n)
 
    ...    

    for i in 1:n
         y[i] ~ Normal(x[i], 1.0)
    end
    
    ...

end

form and factorisation constraints

Another feature of our language is that it supports extra inference constraints specification for random variables (e.g. functional form constraints, on strategy to compute posterior) and factorisation constraints specification for factor nodes, which to my knowledge does not do any of the packages. We use where { options... } syntax for that to inject some extra inference specific context.

For example:

@model function my_model()
    # `x` will be approximated as a point mass, which resembles EM procedure
    x = randomvar() where { marginal_form_constraint = PointMassFormConstraint() }
    # or 
    y ~ Normal(x, v) where { q = MeanField() } # or q = q(x, y)q(v) for structured VMP
end

We plan to split though form and factorisation constraints language from model specification language in the next big release. It should look smth like:

constraints = @constraints begin
    q(x) :: PointMass

    q(x, y, v) = q(x, y)q(v)
end

Don’t hesitate to ask more questions! This description is rather small and probably does not cover all aspects of our specification language. Somewhat more information can be found here: Model Specification · ReactiveMP.jl

Best,
Dmitry.

7 Likes

@cscherrer @bvdmitri Thanks you both for the really excellent detail on both packages. This is really helpful. I understand that it takes time to write such thoughtful responses, so double thanks. I have been talking to some package developers about PPL choices, and your responses have helped them to make some decisions about that. So that is great. I have some follow up questions, but I want to read your posts in more detail a few times before asking.

1 Like

I can comment on Jaynes – it’s a modeling language for Gen which uses generated functions to try and support speculative optimizations/static analysis.

Thus, explaining Gen will help understanding of how Jaynes works, etc.

Mostly, I’ve abandoned working on it in favor of working on Gen proper – due to performance issues with generated functions.

Sorry to hear it hasn’t worked out so cleanly. I’m very curious what performance issues you’ve seen, especially since I use generated functions a lot.

So far I haven’t gotten into stochastic control flow, since that seems much trickier to do well. But I’ve also wondered if there could come opportunities to hook into JET for static analysis, or maybe even the Julia compiler directly. I’ve been waiting for some of this to mature a little more before digging in, but I’d love to hear your take.

I’m wondering if there’s one PPL package that allows downstream developers to generate “model” based on custom struct.

Basically I’m asking which package DOES NOT rely on macro because macro is not composable.

2 Likes

DynamicHMC.jl?
I don’t know what you call a PPL, but it and the surrounding ecosystem are focused on using ordinary Julia code to define your models (and then using HMC for inference).

1 Like

basically my current situation is given a custom struct (constructed from some user config file and data), I can manually write a Turing.@model and I can describe to you programmatically how to write it.

The problem of course is that I can’t programmatically use a macro.

Essentially I know what the prior should be for each input, I’m relying on Turing to construct the likelihood for me