[ANN]: RxInfer.jl 2.0 Julia package for automated Bayesian inference on a factor graph with reactive message passing

Thanks for your comments guys!

What’s confusing me is probably that

x[i] ~ MvNormal(μ = A * x_prev, Σ = Q) 

looks like x[i] is sampled from a normal distribution with covariance Q, but in fact the marginal covariance of x[i] is \Sigma_i from the Lyapunov iteration \Sigma_i = A \Sigma_{i-1} A^T + Q.
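For my own understanding, a quick plain-Julia sketch of that recursion (A, Q and \Sigma_0 chosen arbitrarily for illustration) makes the point concrete:

using LinearAlgebra

# Propagate the marginal covariance through k steps of
# Σ_i = A Σ_{i-1} Aᵀ + Q
function lyapunov_iterate(A, Q, Σ0, k)
    Σ = Σ0
    for _ in 1:k
        Σ = A * Σ * A' + Q
    end
    return Σ
end

A = [0.9 0.1; 0.0 0.8]   # arbitrary transition matrix
Q = [0.1 0.0; 0.0 0.1]   # arbitrary process-noise covariance
Σ5 = lyapunov_iterate(A, Q, Matrix(1.0I, 2, 2), 5)  # covariance of x[5], not Q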

In either case, I now understand how your modeling language works. Keep up the nice work! :slight_smile:

3 Likes

Looks great. Can you share more about where it should excel and where sampling-based methods should be preferred? Are there any guidelines for that?

Hi @RoyiAvital!

Great question! Indeed there are no written guidelines for that at the moment.

Generally, if you can exploit conjugacy (or conditional-conjugacy) in your model or if there are pre-defined nodes that serve your needs, for example, Gamma Mixture node or GCV node, then you will excel by using RxInfer.jl in terms of accuracy and speed!

Most of these nodes representing distributions can be easily combined, e.g., Gamma Mixture can serve as a prior for the precision parameter of Normal distribution.
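For example, a minimal conjugate sketch (a hypothetical model; I use a plain GammaShapeRate rather than the Gamma Mixture node to keep it short) where every message update stays analytical:

using RxInfer

# Hypothetical sketch: a Gamma prior on the precision of a Normal
# likelihood with a known mean -- a conjugate pair, so all updates
# are analytical.
@model function precision_model(n)
    y = datavar(Float64, n)
    τ ~ GammaShapeRate(1.0, 1.0)
    for i in 1:n
        y[i] ~ NormalMeanPrecision(0.0, τ)
    end
end

result = inference(
    model = precision_model(100),
    data  = (y = randn(100), ),
)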

However, we recognize that sometimes you may want to put a Beta prior on top of the mean of a Normal distribution. In this case, we advise using the CVI algorithm, which uses sampling, because there is no analytical solution. Note that the CVI approximation will be employed locally; the rest of the inference will try to use analytical updates (if possible). Advanced users can implement a more efficient local update for this scenario.
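For illustration, such a non-conjugate model might look like the following sketch (hypothetical; the approximation method would be attached via the @meta mechanism shown further down this thread):

using RxInfer

# Hypothetical non-conjugate sketch: a Beta prior on the mean of a
# Normal likelihood. There is no analytical update here, so a local
# approximation such as CVI would be attached to this part of the
# graph, while the rest of the inference stays analytical.
@model function beta_mean_model(n)
    y = datavar(Float64, n)
    m ~ Beta(2.0, 2.0)
    for i in 1:n
        y[i] ~ NormalMeanVariance(m, 1.0)
    end
end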

In a situation where your model consists only of “non-standard” dependencies, which can be solved only through sampling, you can, in principle, run inference by solely using CVI or Unscented approximations, though CVI will not be fast.

We will add more examples highlighting the advantage of the RxInfer.jl approach.

If you have more concrete questions, we will be happy to answer them.

1 Like

So if I get you right, there are 3 cases:

  1. Model based on Conjugacy
    In this case RxInfer.jl will be (almost?) as efficient as using the explicit formulas for the conjugate priors. But if one wants, one could probably do it manually.
  2. Model without Any Conjugacy
    In this case sampling-based methods are the way to go.
  3. Mixed Model
    In this case RxInfer.jl should also have an advantage, as it can exploit conjugacy locally where possible.

In case (3) the question is how efficient the sampling mechanisms in RxInfer.jl are. If they are as efficient as the competition, it will be uniformly the best choice in this case. If not, it will depend on the specific model and the balance between the cases.

It’s pretty hard to provide a general recipe on what to do even within these 3 cases.

  1. Model based on Conjugacy:

Even if your model exhibits conjugacy, inference might still be problematic, so manual derivation of update rules will not be as easy as it may seem. Under the hood, RxInfer.jl casts your probabilistic model into a computational graph, a Forney-style Factor Graph (FFG), which has a one-to-one correspondence with your probabilistic model. If there are loops in your graph, you would typically need a schedule for updates, even in conjugate cases.
The unique feature of RxInfer is that it is based on a reactive programming framework: there is no fixed inference algorithm. The model is broken into a network of factors (nodes), and each node independently processes incoming data streams.
This lets you iterate quickly through your model and inference, and frees you from manually deriving the inference algorithm.

  2. Model without Any Conjugacy:

Generally, yes; however, if you can get by with the Unscented transform or Linearization and you crave speed, then we advise using these techniques. Despite its name, the CVI (conjugate-computation variational inference) approximation can handle many non-conjugate scenarios, but this method still needs more examples. We plan to add more advanced and faster approximations to deal with different intractabilities.

  3. Mixed Model:

Let me put it straight: RxInfer.jl does not have a standalone sampling-based inference mechanism like Turing.jl has (e.g., HMC, SMC). We will add this feature later; probably we will just integrate Turing.jl into our package. However, RxInfer.jl has sampling elements in the CVI method that is used for the Delta node.

4 Likes

This looks great! I’m trying to run a multiple regression analysis based on the linear regression example but can’t figure out how to use matrix multiplication. This works:

# generate data
xdata1 = randn(10)
xdata2 = randn(10)
ydata = 2 .* xdata1 .+ 3 .* xdata2 .+ randn(10)

# define model
@model function linear_regression(n)
    x = datavar(Float64, (n, 2))
    y = datavar(Float64, n)

    # priors
    β0 ~ NormalMeanVariance(0.0, 100.0)
    β = randomvar(2)
    β .~ NormalMeanVariance(0.0, 1.0)
    
    for i in 1:n       
        y[i] ~ NormalMeanVariance(β[1] * x[i, 1] + β[2] * x[i, 2] + β0, 1.0)
        # I was expecting something like this to work:
        #  y[i] ~ NormalMeanVariance(dot(x[i, :], β) + β0, 1.0)
    end
end

# run inference
results = inference(
    model = linear_regression(length(xdata1)), 
    data  = (y = ydata, x = hcat(xdata1, xdata2)), 
    initmessages = (β = NormalMeanVariance(0.0, 1.0),
                    β0 = NormalMeanVariance(0.0, 1.0))
)

When using matrix multiplication I get MethodError: no method matching make_node(::typeof(dot), …), and similar error messages when I try transpose() or '.

Thanks for the contribution, this looks great! I have two questions:

  1. Is there a public non-macro interface? If not, can we get one? :slight_smile:
  2. Would you consider renaming the package to ReactiveInfer? The “Rx” is a bit cryptic, and “Rx” is a common abbreviation for “prescription” (as in a medical prescription).
3 Likes

Hey @stanlazic !

Thank you for your interest in RxInfer. For your example to work, you need to slightly modify your model and input data in the following way:

# define model
@model function linear_regression(n)
    # I changed x to be of the vector type
    x = datavar(Vector{Float64}, n)
    y = datavar(Float64, n)

    # priors
    β0 ~ NormalMeanVariance(1.0, 100.0)
    # I changed beta to be of the multivariate type
    β ~ MvNormalMeanCovariance([ 0.0, 0.0 ], [ 1.0 0.0; 0.0 1.0 ])
    
    for i in 1:n       
        y[i] ~ NormalMeanVariance(dot(x[i], β) + β0, 1.0)
    end
end

and the inference:

xdata =  collect.(zip(xdata1, xdata2)) # collect into pairs

results = inference(
    model = linear_regression(length(xdata1)), 
    data  = (y = ydata, x = xdata), 
    initmessages = (β0 = NormalMeanVariance(0.0, 1.0), )
)

and the results will be available:

mean_cov(results.posteriors[:β])
([1.1638631582255687, 2.1389045170632293], [0.24982928559123693 0.07513981828561604; 0.07513981828561604 0.21513388551908566])

The reason for your error is that dot is not defined between two vectors of RandomVariables, but rather between two multivariate RandomVariables.

You may also be interested in the following example: ReactiveMP: How to run linear model with multiple predictors and an intercept - #6 by albertpod. This example uses an old version of RxInfer, which was called ReactiveMP, but the example itself should work in any case if you remove [ default_factorisation = MeanField() ] from the @model specification and pass constraints = MeanField() to the inference() function as follows:

using RxInfer, LinearAlgebra

n = 250
m = 100

@model function multivariate_linear_regression(n, m)
    a ~ MvNormalMeanCovariance(zeros(m), diagm(ones(m)))
    b ~ NormalMeanVariance(0.0, 1.0)
    W ~ InverseWishart(n + 2, diageye(n))
    c ~ ones(n) * b
    x = datavar(Matrix{Float64})
    y = datavar(Vector{Float64})
    z ~ x * a + c
    y ~ MvNormalMeanCovariance(z, W)
end

results = inference(
    model = multivariate_linear_regression(n,m),
    data  = (y = randn(n), x = randn(n,m)),
    initmarginals = (W = InverseWishart(n+2, diageye(n)), ),
    returnvars   = (a = KeepLast(), b = KeepLast(), W = KeepLast()),
    free_energy = true,
    iterations   = 10,
    constraints = MeanField()
)
2 Likes

Hey @CameronBieganek ! Thanks for your interest!

  1. No, there is none. The @model macro is rather old already (previously we used it with the ReactiveMP.jl library directly). Recently we have prioritised plans to refactor this part, including adding a public non-macro interface for model specification (so that models can be created on the fly or loaded from a file). We cannot provide a definite time frame for when it could be ready, though. So at some point you will get one :slight_smile:
  2. The reason for Rx was to associate with the Reactive eXtensions family of libraries: ReactiveX - Languages. We considered that confusion when choosing a name for the library, but AFAIK Rx is a common abbreviation for “prescription” only in the US and is far less known outside of it (I might be wrong though).
2 Likes

I got it working – many thanks for your help!

1 Like

In a model definition, how do I perform non-trivial calculations on a random variable for use as a parameter in another distribution? I’d like to use intermediate variables, but it doesn’t seem possible. For example, this works:

x ~ Normal(0, 1)
y ~ Normal(3 * x + 1, 1)

but this does not:

x ~ Normal(0, 1)
a = 3 * x + 1
y ~ Normal(a, 1)

I’d like to use the second form as, unlike in the simple case above, I’m not sure how to express my calculation in a closed form without introducing intermediary variables.

Thank you in advance!

Hey @Mark_Ettinger ,

thanks for your interest. You can use the second form if you replace = with ~:

x ~ Normal(0, 1)
a ~ 3 * x + 1
y ~ Normal(a, 1)

Unlike other PPLs, RxInfer uses ~ to model deterministic dependencies between random variables as well. The reason for this behaviour is that the model specification language needs to distinguish between probabilistic expressions and normal Julia expressions. There is no sampling involved: technically, x is not a number, so you cannot use it in regular Julia expressions, but only in probabilistic expressions with ~.

As a consequence, you cannot use random variables in indexing expressions or in control flow operations, as these are normal Julia expressions. For example, this is not allowed:

x ~ Normal(0, 1) 
if x > 0 # not allowed
...
end
2 Likes

Thanks @bvdmitri, your explanation helps a lot! But this doesn’t seem to work:

x ~ Normal(mean = 0, precision = 1)
y ~ 2 ^ x

Is exponentiation not implemented or I am doing something foolish?

(p.s. I’m not only new to RxInfer but also rather new to Julia)

@Mark_Ettinger you do everything correctly, but for arbitrary functions over distributions you need to use non-linear function approximation methods. By default, RxInfer performs analytical inference only with a few operators (like +, -, dot, etc.). If you want to use an arbitrary function in your model specification, I would suggest first giving it a name, for example f, and creating a @meta specification with a specific approximation method:

f(x) = 2 ^ x

@model function mymodel()
    ...
    x ~ Normal(mean = 0, precision = 1)
    y ~ f(x)
end

@meta function mymeta()
    f() -> Linearization
    # Other available methods are Unscented and CVI
end

result = inference(
   model = mymodel(),
   meta = mymeta(),
   ...
)

you can also apply an approximation method in place, which might be even more convenient in some situations:

y ~ f(x) where { meta = Linearization() }

RxInfer will perform analytical inference where possible and fallback to the approximation method where needed.
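Putting the pieces together, a complete sketch could look like this (hypothetical model; the observation noise and priors are chosen arbitrarily):

using RxInfer

f(x) = 2 ^ x

@model function mymodel()
    y = datavar(Float64)
    x ~ Normal(mean = 0.0, precision = 1.0)
    # in-place meta, as above; Unscented() would also work here
    z ~ f(x) where { meta = Linearization() }
    y ~ Normal(mean = z, var = 1.0)
end

result = inference(
    model = mymodel(),
    data  = (y = 1.0, ),
)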

More examples are available in the documentation.

P.S. All arguments to f must be random variables; e.g., y ~ f(x, 2) is not possible at the moment, but should be available in the future.

2 Likes

Thanks for this package, it looks really cool. I was wondering whether there’s a way to fit models on undirected and/or loopy graphs (for example, like the first visualization in this tutorial). If so, what’s the right way to specify those structures within a @model?

Hey @ElOceanografo,

Graphs in RxInfer are undirected (in fact, we don’t even have a notion of direction in the inference data structures). There are also no problems with running inference on loopy graphs, but for that you will need to initialise messages. The inference function supports the initmessages keyword argument precisely for this.

You can take a look at the Linear regression example, which uses loopy belief propagation and the initmessages keyword.

The @model macro works in such a way that ~ simply creates a link between variables with a given factor. So you can even write something like

@model function loopy_model()
    x = randomvar()
    y = randomvar()

    # These two statements create an obvious loop
    x ~ Normal(mean = y, var = 1.0)
    y ~ Normal(mean = x, var = 1.0)

    z = datavar(Float64)
    
    z ~ Normal(mean = x + y, var = 1.0)
end

And later on the inference

result = inference(
    model = loopy_model(),
    data = (z = 1.0, ),
    # The inference will throw an error without the `initmessages`
    initmessages = (
        x = vague(NormalMeanPrecision),
    ),
)

Be aware, however, that loopy belief propagation usually requires many iterations to converge (and for some models it does not converge at all; there are no convergence guarantees for loopy belief propagation). By default, the inference function performs only 1 iteration. This can be controlled with the iterations keyword argument. You may also be interested in the convergence metric enabled with the free_energy = true argument as follows:

result = inference(
    model = loopy_model(),
    data = (z = 1.0, ),
    initmessages = (
        x = vague(NormalMeanPrecision),
    ),
    iterations = 100,
    free_energy = true
)

In my case it gives me the following output:

Inference results:
  Posteriors       | available for (y, x)
  Free Energy:     | Real[14.6851, 13.992, 13.2988, 12.6057, 11.9126, 11.2194, 10.5263, 9.83311, 9.13997, 8.44682  …  1.75593, 1.75593, 1.75593, 1.75593, 1.75593, 1.75593, 1.75593, 1.75593, 1.75593, 1.75593]

I can see that for this example the Free Energy values did indeed converge to a stable point.
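If you want to check this programmatically, the trajectory should be available on the result object (I am assuming the free_energy field here, matching the printed summary above):

fe = result.free_energy             # one Real value per iteration
println(abs(fe[end] - fe[end-1]))   # ≈ 0 once the iterations have converged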

1 Like

Thank you, that is very helpful! So if I wanted to model a Gaussian Markov random field, say for example this simple one with four latent nodes (a-d) and two observations (x and y):

x
|
a---b
|   |
c---d---y

what would be the correct way to specify it in the @model function?

Hi @ElOceanografo!

To give a tiny bit of background, RxInfer.jl uses a particular type of factor graph, which establishes the connections between the random variables. I will not go into detail about the computational graph behind the package. Still, it technically means that if I want to implement your model, I need to add additional nodes (functions) to your graph representation.

For example, you can specify the graph as follows:

x
|
(f_x)
|
a---(f_a)---b
|           |
(f_a)       |
|           |   
c--------(f_cb)--d--(f_d)--y

The difference with your specification is that I explicitly show the functional dependencies between your variables. The graph I show is still incomplete, though I hope it helps you understand what’s missing from the RxInfer.jl perspective when implementing your graph.

Now, depending on the functional form of the fs (let’s say Gaussians, i.e., f_i(i, x) = N(i | x, 1.0) for i in {x, a, d} and f_cb = N(d | c + b, 1.0)), you can write the model and inference in RxInfer.jl as:

using RxInfer

@model function loopy_model()

    y   = datavar(Float64)
    x_0 = datavar(Float64)

    # variances are chosen arbitrarily
    x ~ Normal(mean = x_0, var = 1.0)
    a ~ Normal(mean = x, var = 1.0)   # f_x
    b ~ Normal(mean = a, var = 1.0)   # f_a
    c ~ Normal(mean = a, var = 1.0)   # f_a
    d ~ Normal(mean = c+b, var = 1.0) # f_cb

    y ~ Normal(mean = d, var = 1.0)
end

result = inference(
    model = loopy_model(),
    data = (y = 1.0, x_0=1.0),
    initmessages = (
        a = vague(NormalMeanPrecision),
    ),
    iterations = 50,
    free_energy = true
)

The @model macro lets you specify the model in a probabilistic way; RxInfer.jl handles the graphical equivalent of the model under the hood for fast computations.

2 Likes

So if I’m understanding the modeling language correctly, the left-hand side of each ~ is a node, and each node can only appear in one ~ statement. The right-hand side of each ~ corresponds to a factor. If a node has more than one edge coming into it, it needs to be explicitly written as a function of all the connected nodes. So a graph like this

x---a   b---y
     \ /
      d
      |
      c
      |
      z

could be specified like this:

x = datavar(Float64)
y = datavar(Float64)
z = datavar(Float64)
# v is arbitrary
a ~ Normal(x, v)
b ~ Normal(y, v)
c ~ Normal(z, v)
d ~ Normal((a + b + c) / 3, v)

Is that right?

RxInfer.jl uses Forney-style Factor Graphs (FFGs) as the graphical representation of the generative model. We use edges to represent variables and nodes for factors/functions. Indeed, you can think of the left-hand side of ~ as a variable defined by an edge (or a node, in your description; it really doesn’t matter for now). The right-hand side of ~ is indeed a factor.

If a node has more than one edge coming into it (in the case of a multi-argument function), you would need to pass all the required variables inside the function.

For example, if you have a Gaussian factor node N(y|x, z), it has three edges: the output y, the mean x, and the variance z. In your model definition, you would have to write something like this:

...
y ~ Normal(mean=x, variance=z)
...

As for the model you have specified, yes, that could be an option.
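For completeness, a runnable sketch along those lines might look as follows (my own assumptions, not part of your question: v is fixed to 1.0, the division is written as multiplication by a constant since constant multiplication is one of the supported analytical operators mentioned above, and I attached a dummy observation w to d so that the d edge is not left dangling):

using RxInfer

@model function star_model()
    x = datavar(Float64)
    y = datavar(Float64)
    z = datavar(Float64)
    w = datavar(Float64)  # dummy observation attached to d

    v = 1.0  # arbitrary variance
    a ~ NormalMeanVariance(x, v)
    b ~ NormalMeanVariance(y, v)
    c ~ NormalMeanVariance(z, v)
    d ~ NormalMeanVariance((1 / 3) * (a + b + c), v)
    w ~ NormalMeanVariance(d, v)
end

result = inference(
    model = star_model(),
    data  = (x = 0.5, y = -0.3, z = 1.2, w = 0.0),
)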