ReactiveMP: How to run linear model with multiple predictors and an intercept

Hi there

I would like to adapt the linear regression example from the docs to work with multiple predictor variables. So far, I have the following code, which runs.

using GraphPPL, Rocket, ReactiveMP, LinearAlgebra

n = 250
m = 100
@model function linear_regression(n, m)

    a ~ MvNormalMeanCovariance(zeros(m), diagm(ones(m)))  # prior over the regression coefficients
    x = datavar(Matrix{Float64})   # n×m design matrix
    y = datavar(Vector{Float64})   # n observations

    y ~ MvNormalMeanCovariance(x * a, tiny .* diagm(ones(n)))  # (near) noise-free likelihood

    return a, y
end
results = inference(
    model = Model(linear_regression, n, m),
    data  = (y = randn(n), x = randn(n, m)),
    returnvars   = (a = KeepLast(),),
    iterations   = 20
);

However, there is currently no intercept. I have tried various rewrites, all of which error for different reasons. Once I have this working, I would also want to estimate the variance of the likelihood during inference, and potentially use different distributions for the likelihood. But primarily, I would really appreciate some pointers on incorporating an intercept into the above model.

Thanks in advance.

This works (runs at least, haven’t tested thoroughly)

@model function linear_regression(n, m)
    a ~ MvNormalMeanCovariance(zeros(m), diagm(ones(m)))  # regression coefficients
    b ~ NormalMeanVariance(0.0, 1.0)                      # intercept
    c ~ ones(n) * b                                       # intercept repeated over all n observations
    x = datavar(Matrix{Float64})
    y = datavar(Vector{Float64})
    z ~ x * a + c
    y ~ MvNormalMeanCovariance(z, tiny .* diagm(ones(n)))

    return a, y
end
results = inference(
    model = Model(linear_regression, n, m),
    data  = (y = randn(n), x = randn(n, m)),
    returnvars   = (a = KeepLast(),),
    iterations   = 20
);

Guess I just need to get my head around datavars/randomvars that are vectors/matrices vs. vectors/matrices of datavars/randomvars, and brush up on very basic linear algebra!
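
If I've understood the docs correctly, the distinction is roughly this (my own summary, so take it with a grain of salt):

# a single data variable whose value is a whole n×m matrix
x = datavar(Matrix{Float64})

# a vector of n scalar data variables, indexed as y[i]
y = datavar(Float64, n)

# a vector of n random variables, indexed as z[i]
z = randomvar(n)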

But when trying to infer the variance:

@model [ default_factorisation = MeanField() ] function linear_regression(n, m)
    a ~ MvNormalMeanCovariance(zeros(m), diagm(ones(m)))
    b ~ NormalMeanVariance(0.0, 1.0)
    σ ~ GammaShapeRate(1.0, 1.0)
    c ~ ones(n) * b
    x = datavar(Matrix{Float64})
    y = datavar(Vector{Float64})
    z ~ x * a + c
    y ~ MvNormalMeanCovariance(z, σ .* diagm(ones(n)))

    return a, y
end
results = inference(
    model = Model(linear_regression, n, m),
    data  = (y = randn(n), x = randn(n, m)),
    returnvars   = (a = KeepLast(),),
    iterations   = 20
);

I get

ERROR: MethodError: no method matching make_node(::Type{MvNormalMeanCovariance}, ::FactorNodeCreationOptions{MeanField, Nothing, Nothing}, ::AutoVar, ::RandomVariable, ::Matrix{RandomVariable})
Closest candidates are:
  make_node(::FactorGraphModel, ::FactorNodeCreationOptions, ::Any, ::Any...) at C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\model.jl:428
  make_node(::Type{<:NormalMixture{N}}, ::F, ::M, ::P) where {N, F, M, P} at C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\nodes\normal_mixture.jl:266
  make_node(::Type{<:GammaMixture{N}}, ::F, ::M, ::P) where {N, F, M, P} at C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\nodes\gamma_mixture.jl:208
  ...
Stacktrace:
 [1] make_node(::FactorGraphModel, ::FactorNodeCreationOptions{Nothing, Nothing, Nothing}, ::Type, ::AutoVar, ::RandomVariable, ::Matrix{RandomVariable})
   @ ReactiveMP C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\model.jl:429
 [2] macro expansion
   @ .\REPL[736]:11 [inlined]
 [3] macro expansion
   @ C:\Users\arn203\.julia\packages\GraphPPL\cjw4Q\src\model.jl:390 [inlined]
 [4] create_model(::Type{linear_regression}, linear_regressionconstraints_in#4290::Nothing, linear_regressionmeta_in#4292::Nothing, linear_regressionoptions_in#4294::NamedTuple{(), Tuple{}}, n::Int64, m::Int64)
   @ Main C:\Users\arn203\.julia\packages\GraphPPL\cjw4Q\src\backends\reactivemp.jl:73
 [5] create_model
   @ C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\model.jl:48 [inlined]
 [6] inference(; model::ReactiveMP.ModelGenerator{linear_regression, Tuple{Int64, Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, data::NamedTuple{(:y, :x), Tuple{Vector{Float64}, Matrix{Float64}}}, initmarginals::Nothing, initmessages::Nothing, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}}, returnvars::NamedTuple{(:a,), Tuple{KeepLast}}, iterations::Int64, free_energy::Bool, free_energy_diagnostics::Tuple{BetheFreeEnergyCheckNaNs, BetheFreeEnergyCheckInfs}, showprogress::Bool, callbacks::Nothing, warn::Bool)
   @ ReactiveMP C:\Users\arn203\.julia\packages\ReactiveMP\PPQkO\src\inference.jl:319
 [7] top-level scope
   @ REPL[737]:1

Hi there,

I’d suggest reading through Albert’s response

So in line with that, I would try:

  • add init messages for each separate node and init marginals (you can remove them later)
  • take out the broadcast of sigma and create an auxiliary variable (as per Albert’s response), where you provide the explicit factorization via a constraint (rough sketch below)
  • look for inspiration in the test suite

But it’s all guesswork from me, as I haven’t tried to build a model with an MvNormal likelihood. I’m not sure how the factorization between location and covariance in MvNormal works in ReactiveMP, because you should use Cholesky there…
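
To make the auxiliary-variable bullet more concrete, this is roughly the shape I have in mind (untested guesswork; it assumes x is passed into the model as a plain matrix argument rather than a datavar):

# inside the @model body: a scalar precision plus a univariate likelihood per observation
σ ~ GammaShapeRate(1.0, 1.0)
z = randomvar(n)
y = datavar(Float64, n)
for i in 1:n
    z[i] ~ dot(x[i, :], a) + b           # auxiliary variable for the linear predictor
    y[i] ~ NormalMeanPrecision(z[i], σ)  # no σ .* diagm(...) broadcast needed
end

together with an explicit factorization constraint along the lines of

constraints = @constraints begin
    q(z, σ) = q(z)q(σ)
end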


Thank you, I will dig into these. To be honest, I’m not tied to the MvNormal likelihood; in fact I’d like to move away from using it. But it was the first way I managed to incorporate multiple predictors into the model.

Hi @EvoArt!

Thanks for trying out ReactiveMP.jl.

You are trying to impose a prior on the diagonal elements of the covariance matrix of your likelihood function.
We’ve just created a PR that will support this setup (although you’d need to use a precision parameter instead of a variance, and the MvNormalMeanPrecision likelihood instead of MvNormalMeanCovariance).

To run your model with the current master branch of ReactiveMP.jl, you can impose an InverseWishart prior on the covariance matrix of your likelihood function. I understand you don’t want additional correlations between your observations, so this solution is not a 100% match for your intentions:

using GraphPPL, Rocket, ReactiveMP, LinearAlgebra

n = 250
m = 100

@model [ default_factorisation = MeanField() ] function linear_regression(n, m)
    a ~ MvNormalMeanCovariance(zeros(m), diagm(ones(m)))  # regression coefficients
    b ~ NormalMeanVariance(0.0, 1.0)                      # intercept
    W ~ InverseWishart(n + 2, diageye(n))                 # prior on the full covariance matrix of the likelihood
    c ~ ones(n) * b
    x = datavar(Matrix{Float64})
    y = datavar(Vector{Float64})
    z ~ x * a + c
    y ~ MvNormalMeanCovariance(z, W)

end

results = inference(
    model = Model(linear_regression, n, m),
    data  = (y = randn(n), x = randn(n, m)),
    initmarginals = (W = InverseWishart(n + 2, diageye(n)), ),
    returnvars   = (a = KeepLast(), b = KeepLast(), W = KeepLast()),
    free_energy = true,
    iterations   = 10
);

using Plots
plot(results.free_energy) # check convergence

Note that you need to initialize the marginal for W (initmarginals = (W = InverseWishart(n + 2, diageye(n)), )) because part of your computational graph will perform VMP. If you have further questions, we will be happy to help!


Thanks so much for getting back to me. That’s working nicely. This really is a great package, and it’s nice to see Bayesian inference at speed!

Couple of quick questions:
Is there a way I could implement the same model (multiple predictors, an intercept, and inferred variance or precision), but just using a Normal likelihood and a for loop, instead of MvNormal?

Also, I have taken courses in Bayesian inference (more applied than theoretical), but it is far from my actual field of expertise. I fear I won’t find ReactiveMP intuitive, or its limitations obvious, without a better understanding of what’s going on under the hood. Would you be able to recommend any accessible-ish reading or viewing materials that would give me some intuition about the inference algorithms used, and about what will/won’t be (even theoretically) possible?

I can answer your first question if by predictors you mean regressors (inputs).

This is a simple curve-fitting example that uses splines:

using ReactiveMP, GraphPPL, Rocket
using Distributions, Random, Plots, StatsPlots
using Splines2

# Generate data
time_max = 365
noise_scale = 0.05

time_index = collect(1:time_max) / time_max

p = 180 / time_max
growth = 2
offset = 0
y = offset .+ sin.(time_index * 2π / p) .+ growth .* time_index .+ rand(Normal(0, noise_scale), time_max)
plot(y)

# Prepare spline bases
X = Splines2.bs(time_index, df = 10, boundary_knots = (-0.01, 1.01));

# Define model
@model function linreg(X, n, dim_x)

    T = Float64

    y = datavar(T, n)
    aux = randomvar(n)

    sigma ~ GammaShapeRate(1.0, 1.0)
    intercept ~ NormalMeanVariance(0.0, 2.0)
    beta ~ MvNormalMeanPrecision(zeros(dim_x), diageye(dim_x))

    for i in 1:n
        aux[i] ~ intercept + dot(X[i, :], beta)
        y[i] ~ NormalMeanPrecision(aux[i], sigma)
    end

    return beta, aux, y
end

constraints = @constraints begin 
    q(aux, sigma) = q(aux)q(sigma)
end

# Run inference
@time results = inference(
    model = Model(linreg, X, size(X)...),
    data  = (y = y,),
    constraints = constraints,
    initmessages = (intercept = vague(NormalMeanVariance),),
    initmarginals = (sigma = GammaShapeRate(1.0, 1.0),),
    returnvars   = (sigma = KeepLast(), beta = KeepLast(), aux = KeepLast()),
    iterations   = 20,
    warn = true,
    free_energy = true
)
# Check ELBO
plot(results.free_energy, label = "ELBO")

# Plot fit
plot(mean.(results.posteriors[:aux]), ribbon = (results.posteriors[:sigma] |> mean |> inv |> sqrt), label = "Fitted")
plot!(y, label = "Observed data")

If you see some divergence at the beginning of your series, you have to increase your iterations (see related issue here: Fit at the beginning of an array takes 20x more iterations than the rest · Issue #146 · biaslab/ReactiveMP.jl · GitHub)

If you have multiple observed series, you might be able to introduce an extra dimension in the datavar and loop through it: docs
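
Untested guesswork from me again, but something along these lines, reusing intercept, beta and sigma from the model above (the multi-dimensional datavar/randomvar signatures are my assumption based on the docs):

# n time points × k observed series sharing the same regression weights
y   = datavar(Float64, n, k)
aux = randomvar(n, k)

for j in 1:k
    for i in 1:n
        aux[i, j] ~ intercept + dot(X[i, :], beta)
        y[i, j] ~ NormalMeanPrecision(aux[i, j], sigma)
    end
end

with the data then presumably supplied as an n×k matrix, e.g. data = (y = Y,), where Y is just an illustrative name for your stacked observations.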


Fantastic. Thanks.

Is there a way I could implement the same model (multiple predictors, an intercept, and inferred variance or precision), but just using a Normal likelihood and a for loop, instead of MvNormal?

I think the original model where you use x = datavar(Matrix{Float64}) is quite clever, but it would help if you write down the model you are trying to solve.

I fear I won’t find ReactiveMP intuitive, or its limitations obvious, without a better understanding of what’s going on under the hood. Would you be able to recommend any accessible-ish reading or viewing materials that would give me some intuition about the inference algorithms used, and about what will/won’t be (even theoretically) possible?

Indeed, we are working very hard to improve the general usability of ReactiveMP.jl and to give more informative error messages. Currently, ReactiveMP.jl provides a very sophisticated tool to run Bayesian inference, but you do need a good understanding of message-passing algorithms in general. Here is a list of literature you will certainly find useful:

This is the thesis of one of our PhD students (currently an associate professor at TU/e); it describes the previous iteration of our message-passing toolbox, ForneyLab.jl. ReactiveMP.jl is the next iteration of ForneyLab.jl and differs from an implementation point of view, but the ideas are exactly the same.

Other useful material:
