Simple linear regression with ReactiveMP - RuleMethodError?

I’ve tried to implement a simple linear regression with a Normal observation model, effectively a combination of two of the examples: Linear Regression and Univariate Normal Mixture.

When I define random variables for both the location and the scale of the observed variable, I get a RuleMethodError (probably related to the initmessages I pass to the inference() function? See the full example at the bottom).
However, the rule seems to exist in rules.jl on GitHub, but with a q_tau keyword rather than the m_tau that my implementation looks for (it's unclear to me why).

RuleMethodError: no method matching rule for the given arguments

Possible fix, define:

@rule NormalMeanPrecision(:μ, Marginalisation) (q_out::PointMass, m_τ::GammaShapeRate, ) = begin 
    return ...
end

Stacktrace:
  [1] rule(fform::Type, on::Type, vconstraint::Marginalisation, mnames::Type, messages::Tuple{Message{GammaShapeRate{Float64}}}, qnames::Type, marginals::Tuple{Marginal{PointMass{Float64}}}, meta::Nothing, __node::FactorNode{Type{NormalMeanPrecision}, Tuple{NodeInterface, NodeInterface, NodeInterface}, Tuple{Tuple{Int64}, Tuple{Int64, Int64}}, ReactiveMP.FactorNodeLocalMarginals{Tuple{ReactiveMP.FactorNodeLocalMarginal, ReactiveMP.FactorNodeLocalMarginal}}, Nothing, ReactiveMP.FactorNodePipeline{DefaultFunctionalDependencies, EmptyPipelineStage}})
    @ ReactiveMP ~/.julia/packages/ReactiveMP/yX58s/src/rule.jl:744
  [2] materialize!(mapping::ReactiveMP.MessageMapping{NormalMeanPrecision, DataType, Marginalisation, DataType, DataType, Nothing, FactorNode{Type{NormalMeanPrecision}, Tuple{NodeInterface, NodeInterface, NodeInterface}, Tuple{Tuple{Int64}, Tuple{Int64, Int64}}, ReactiveMP.FactorNodeLocalMarginals{Tuple{ReactiveMP.FactorNodeLocalMarginal, ReactiveMP.FactorNodeLocalMarginal}}, Nothing, ReactiveMP.FactorNodePipeline{DefaultFunctionalDependencies, EmptyPipelineStage}}}, dependencies::Tuple{Tuple{Message{GammaShapeRate{Float64}}}, Tuple{Marginal{PointMass{Float64}}}})
    @ ReactiveMP ~/.julia/packages/ReactiveMP/yX58s/src/message.jl:243
  [3] materialize!(vmessage::VariationalMessage{Tuple{ReactiveMP.MessageObservable{Message}}, Tuple{ProxyObservable{Marginal, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}, Rocket.MapProxy{Message{PointMass{Float64}}, typeof(as_marginal)}}}, ReactiveMP.MessageMapping{NormalMeanPrecision, DataType, Marginalisation, DataType, DataType, Nothing, FactorNode{Type{NormalMeanPrecision}, Tuple{NodeInterface, NodeInterface, NodeInterface}, Tuple{Tuple{Int64}, Tuple{Int64, Int64}}, ReactiveMP.FactorNodeLocalMarginals{Tuple{ReactiveMP.FactorNodeLocalMarginal, ReactiveMP.FactorNodeLocalMarginal}}, Nothing, ReactiveMP.FactorNodePipeline{DefaultFunctionalDependencies, EmptyPipelineStage}}}})
    @ ReactiveMP ~/.julia/packages/ReactiveMP/yX58s/src/message.jl:162
  [4] as_message(vmessage::VariationalMessage{Tuple{ReactiveMP.MessageObservable{Message}}, Tuple{ProxyObservable{Marginal, Rocket.RecentSubjectInstance{Message{PointMass{Float64}}, Subject{Message{PointMass{Float64}}, AsapScheduler, AsapScheduler}}, Rocket.MapProxy{Message{PointMass{Float64}}, typeof(as_marginal)}}}, ReactiveMP.MessageMapping{NormalMeanPrecision, DataType, Marginalisation, DataType, DataType, Nothing, FactorNode{Type{NormalMeanPrecision}, Tuple{NodeInterface, NodeInterface, NodeInterface}, Tuple{Tuple{Int64}, Tuple{Int64, Int64}}, ReactiveMP.FactorNodeLocalMarginals{Tuple{ReactiveMP.FactorNodeLocalMarginal, ReactiveMP.FactorNodeLocalMarginal}}, Nothing, ReactiveMP.FactorNodePipeline{DefaultFunctionalDependencies, EmptyPipelineStage}}}})
    @ ReactiveMP ~/.julia/packages/ReactiveMP/yX58s/src/message.jl:170

I’ve tried to provide an explicit factorization (mean-field as a global parameter and via the where clause), but it didn’t help.

When I remove the sigma and b messages from the initmessages of the inference function, I get the following error:

    Variables [ a, b, sigma ] have not been updated after a single inference iteration. 
    Therefore, make sure to initialize all required marginals and messages. See `initmarginals` and `initmessages` keyword arguments for the `inference` function. 

Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33
 [2] inference(; model::ReactiveMP.ModelGenerator{linreg, Tuple{Int64}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Nothing, Nothing, Nothing}, data::NamedTuple{(:y, :time_index), Tuple{Vector{Float64}, Vector{Float64}}}, initmarginals::Nothing, initmessages::NamedTuple{(:a,), Tuple{NormalMeanVariance{Float64}}}, constraints::Nothing, meta::Nothing, options::NamedTuple{(), Tuple{}}, returnvars::NamedTuple{(:a, :b, :sigma), Tuple{KeepLast, KeepLast, KeepLast}}, iterations::Int64, free_energy::Bool, showprogress::Bool, callbacks::Nothing, warn::Bool)
   @ ReactiveMP ~/.julia/packages/ReactiveMP/yX58s/src/inference.jl:361
 [3] top-level scope
   @ In[226]:1
 [4] eval
   @ ./boot.jl:373 [inlined]
 [5] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
   @ Base ./loading.jl:1196

I feel like I’m missing something obvious, because I haven’t found anything in the introductory paper or the documentation suggesting that this case is not supported.
(I don’t think I need initmarginals as well as initmessages in this case, but I wasn’t sure what else to try.)

Thank you for your help!

MRE:

# generate data
a=0.2
b=5
time_index=collect(1.:20.)
noise_scale=0.5
y=time_index .* a .+ b + noise_scale*randn(size(time_index,1))

# define model
@model function linreg(n)
    a ~ NormalMeanVariance(0.0, 10.0)
    b ~ NormalMeanVariance(0.0, 10.0)
    sigma  ~ GammaShapeRate(1.0, 1.0)
    
    time_index = datavar(Float64, n)
    y = datavar(Float64, n)

    for i in 1:n
        y[i] ~ NormalMeanPrecision(a * time_index[i] + b, sigma)
    end

    return a, b, time_index, y
end

# run Var. inference
results = inference(
    model = Model(linreg, length(time_index)),
    data  = (y = y, time_index = time_index),
    initmessages = (
        a = vague(NormalMeanVariance),
        b = vague(NormalMeanVariance),
        sigma = vague(GammaShapeRate)
        ),
    initmarginals=(a = vague(NormalMeanVariance),
        b = vague(NormalMeanVariance),
        sigma = vague(GammaShapeRate)
    ),
    returnvars   = (a = KeepLast(), b = KeepLast(),sigma=KeepLast()),
    iterations   = 20,
);

System:

> Pkg.status()
      Status `~/reactive-mp/Project.toml`
  [6e4b80f9] BenchmarkTools v1.3.1
  [31c24e10] Distributions v0.25.59
  [b3f8163a] GraphPPL v2.0.1
  [91a5bcdd] Plots v1.29.0
  [a194aa59] ReactiveMP v2.0.3
  [df971d30] Rocket v1.3.22
  [5a560754] Splines2 v0.2.1
  [860ef19b] StableRNGs v1.0.0
  [9a3f8284] Random


> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, westmere)
Environment:
  JULIA_NUM_THREADS = 4

Hi @svilupp!

Thanks for trying out ReactiveMP.jl. We somehow missed your question on Discourse; sorry for the long wait for a reply.

Please note that assigning a Gamma prior to the precision parameter of a Normal likelihood results in analytically intractable inference. In the context of message passing, this means that you can’t run belief propagation on your graph, which causes the first error: the prefix m_ stands for a message, while q_ stands for a marginal.
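To make the naming concrete, the two kinds of rules are spelled roughly like this with the @rule macro (just a sketch of the signatures, not the actual implementations in rules.jl):

# The rule your graph asks for: it consumes a message (m_) on the τ edge.
# This is what the error reports as missing, because the corresponding update is intractable.
@rule NormalMeanPrecision(:μ, Marginalisation) (q_out::PointMass, m_τ::GammaShapeRate, ) = begin
    return ...
end

# The rule that does exist: it consumes the marginal (q_) on the τ edge instead, i.e. a
# variational (VMP) update; for a PointMass observation this boils down to a Normal message
# with mean mean(q_out) and precision mean(q_τ), which is tractable.
@rule NormalMeanPrecision(:μ, Marginalisation) (q_out::PointMass, q_τ::GammaShapeRate, ) = begin
    return ...
end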

To circumvent this issue, we need to resort to, for example, variational message passing (VMP) in the context of factor graphs.

Now, there are a few ways to do that, but let me first point out that the only place where we need to factorize the model is around the NormalMeanPrecision(a * time_index[i] + b, sigma) factor. In particular, we need a mean-field factorization between the mean and the precision, i.e. between a * time_index[i] + b and sigma.

The first and easiest way to do that is to use the default_factorisation option when specifying the model:

@model [ default_factorisation = MeanField() ] function linreg(n)
    a ~ NormalMeanVariance(0.0, 10.0)
    b ~ NormalMeanVariance(0.0, 10.0)
    sigma  ~ GammaShapeRate(1.0, 1.0)
    
    time_index = datavar(Float64, n)
    y = datavar(Float64, n)

    for i in 1:n
        y[i]   ~ NormalMeanPrecision(a * time_index[i] + b, sigma)
    end
end

In this way, ReactiveMP will use a mean-field factorization wherever possible, which in our case is just between a * time_index[i] + b and sigma.

Then the inference will run smoothly:

results = inference(
    model = Model(linreg, length(time_index)),
    data  = (y = y, time_index = time_index),
    initmessages = (b = vague(NormalMeanVariance),),
    initmarginals = (sigma = vague(GammaShapeRate),),
    returnvars   = (a = KeepLast(), b = KeepLast(),sigma=KeepLast()),
    iterations   = 20,
    showprogress = true,
)

About initmarginals and initmessages: non-rigorously, you need to provide initial marginals when you resort to VMP. Likewise, you need to provide initial messages when the graph contains loops (technically there is a little more to it). Here, we need to initialize a marginal for sigma and a message for either a or b.
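For this particular model, that rule of thumb maps onto the keywords used above as follows (just annotating the same values):

# sigma sits behind the mean-field factorization, so VMP needs its marginal initialized:
initmarginals = (sigma = vague(GammaShapeRate),)

# a and b both feed every a * time_index[i] + b node, which creates loops in the graph,
# so a message for one of them has to be initialized to get the iterations started:
initmessages = (b = vague(NormalMeanVariance),)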

Alternatively, you can create an auxiliary vector aux that will represent the means of your likelihood, i.e.:

@model function linreg(n)
    a ~ NormalMeanVariance(0.0, 10.0)
    b ~ NormalMeanVariance(0.0, 10.0)
    sigma  ~ GammaShapeRate(1.0, 1.0)
    
    aux = randomvar(n)
    time_index = datavar(Float64, n)
    y = datavar(Float64, n)

    for i in 1:n
        aux[i] ~ a * time_index[i] + b
        y[i]   ~ NormalMeanPrecision(aux[i], sigma)
    end
end

In this case, you would need to provide specific constraints on your posterior factorization:

constraints = @constraints begin 
    q(aux, sigma) = q(aux)q(sigma)
end

and feed them into your inference function:

results = inference(
    model = Model(linreg, length(time_index)),
    data  = (y = y, time_index = time_index),
    constraints = constraints,
    initmessages = (b = vague(NormalMeanVariance),),
    initmarginals = (sigma = vague(GammaShapeRate),),
    returnvars   = (a = KeepLast(), b = KeepLast(),sigma=KeepLast()),
    iterations   = 20,
    showprogress = true,
)
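Once either version has run, you can sanity-check the recovered parameters against the values used to generate the data (a quick check; results.posteriors holds the inferred marginals keyed by variable name):

mean(results.posteriors[:a]), var(results.posteriors[:a])   # mean should be close to a = 0.2
mean(results.posteriors[:b]), var(results.posteriors[:b])   # mean should be close to b = 5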

This is a good extension of the Linear regression demo; please feel free to send a PR with a new demo!


Thank you for your reply, @albertpod !

It’s exciting to see it run! I’ve tried the MF mode but never with the right combination of initmessages/initmarginals. Your rule of thumb is helpful.

Happy to add it to the demo, but I’m not comfortable with the results / recovery of the original parameters yet.

While the a and b parameters are okay, the sigma parameter is quite off. The posterior is unexpectedly wide: the true value was 0.5, the prior was Gamma(1, 1), and the posterior is Gamma(11, 0.4) (i.e., the mean is around 5, which is 10x what we need).
I’m not sure where it picked up such width. It’s different from the model prior, so some information is flowing.

Do you have any intuition why that could be?

I have tried various settings short of changing the priors and initmarginals to be more informative, which is my next step.

I cross-checked the result in NumPyro to see whether 20 data points are enough with MF (simple SVI, same model, AutoDiagonalNormal guide), and that worked exactly as expected.

Thank you for your help.

Hi @svilupp!

The way you generate data is the following, right?

# generate data
a=0.2
b=5
time_index=collect(1.:20.)
noise_scale=0.5
y=time_index .* a .+ b + noise_scale*randn(size(time_index,1))

Here, the noise_scale variable corresponds to a standard deviation of 0.5 (i.e. a variance of 0.25).
Note that the sigma variable of your model is the precision parameter, not the variance.

As you might know, the relationship between the two is precision = 1/variance.
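With the values you used, that means:

noise_scale     = 0.5
noise_variance  = noise_scale^2         # 0.25
noise_precision = 1 / noise_variance    # 4.0, the value the posterior over sigma should concentrate around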

When you retrieve the posterior of the precision parameter and check its mean and variance (corresponding to the precision of your likelihood), you get:

msigma = mean(results.posteriors[:sigma]) # mean of the posterior
vsigma = var(results.posteriors[:sigma]) # variance of the posterior

To retrieve an estimate of noise_scale, we invert the precision estimate and take the square root:

inv_msigma = sqrt(inv(msigma)) # estimates for noise_scale

In this way, I get inv_msigma = 0.49521700118560297!

You can try to generate the data in the following way instead (so that it maps to your model likelihood more or less one-to-one):
y = time_index .* a .+ b .+ rand(NormalMeanPrecision(0.0, noise_precision), size(time_index,1));


Of course! Thank you! That was a silly mistake… I had spent an hour trying to adjust the priors and never thought to check the parametrization I was using…


On a separate but related note:
Is there a reason why I cannot use NormalMeanVariance for the observed variable (i.e., just swapping out NormalMeanPrecision)?

I get the following error:

RuleMethodError: no method matching rule for the given arguments

Possible fix, define:

@rule NormalMeanVariance(:v, Marginalisation) (q_out::PointMass, q_μ::NormalWeightedMeanPrecision, ) = begin 
    return ...
end

I have tried all the various ways to explicitly define the MF factorization (after @model, via where {}, with and without the aux variable).

I think I’m lacking a fundamental understanding of why μ becomes NormalWeightedMeanPrecision-distributed despite my a and b being parametrized by NormalMeanVariance in the prior. I have noticed that the same happens to the a and b posteriors as well.

  • Are all Normal posteriors coerced to NormalWeightedMeanPrecision?
  • Is there a reason why NormalWeightedMeanPrecision and NormalMeanVariance don’t play nice when I’m using VMP?

I think I ended up with the NormalMeanPrecision parametrization when trying to debug the earlier error and got further with NormalWeightedMeanPrecision without really understanding why.


Would you be able to recommend any resources or heuristics on which initmarginals/initmessages are needed and when?
I’d be curious to understand it a bit more. I have read the recommended introduction in your documentation (Loeliger et al. (2007)).

Hi @svilupp!

If you use NormalMeanVariance as your likelihood function, you’d need to assign an InverseGamma prior to the variance. At the moment, that’s just not implemented in ReactiveMP.jl. We are working on it, as well as on other cool things! Keep an eye on it :wink: In theory, using NormalMeanPrecision as the likelihood function shouldn’t change the resulting posterior.

So, the error you see suggests that no variational rule is implemented for NormalMeanVariance. It has nothing to do with q_μ being NormalWeightedMeanPrecision; ReactiveMP.jl doesn’t care about the parametrization of a Gaussian, it groups NormalWeightedMeanPrecision, NormalMeanPrecision, and NormalMeanVariance under the “umbrella” type UnivariateNormalDistributionsFamily.
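For instance (a small sketch; mean and var work for any member of that family, and, if I remember correctly, you can also switch between parametrizations explicitly):

post_a = results.posteriors[:a]          # may come back as NormalWeightedMeanPrecision
mean(post_a), var(post_a)                # parametrization-agnostic queries
convert(NormalMeanVariance, post_a)      # same distribution, different parametrization (if supported by your version)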

NormalWeightedMeanPrecision is preferred in ReactiveMP.jl for computational reasons; it’s just faster :wink:

We recognize that the current stack trace is not very informative; we will improve it soon.

As for the literature, Loeliger et al. (2007) is a good start, as is Sascha Korl’s thesis (2005).

For a theoretical analysis of these graphs and message-passing algorithms, I would refer you to Senoz et al.

For ReactiveMP.jl itself, you can have a look at Bagaev et al. (2021), but watch out: the inference API in that paper is a bit dated.

P.S. If you don’t change the model but only tweak the initmarginals, initmessages, and priors, you can use the free_energy score as a proxy for model comparison :wink:

inference(blablabla,
          free_energy=true)
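and then read the free energy trace off the result (assuming the field name from the documentation; one value per iteration):

results.free_energy        # vector of Bethe free energy values, one per iteration
last(results.free_energy)  # lower is better when comparing runs with the same model structure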