@dilumaluthge and I are pleased to announce the release of SossMLJ.jl, which makes it easy to use models from the Soss.jl probabilistic programming language in MLJ.jl.
Here’s a linear regression example, expressed as a special case of a generalized linear model. First, we import the necessary packages:
using Distributions
using MLJBase
using Soss
using SossMLJ
using Statistics
Next we write the model in Soss:
m = @model X, s, t begin
    p = size(X, 2)             # number of features
    β ~ Normal(0, s) |> iid(p) # coefficients
    σ ~ HalfNormal(t)          # dispersion
    η = X * β                  # linear predictor
    μ = η                      # `μ = g⁻¹(η) = η`
    y ~ For(eachindex(μ)) do j
        Normal(μ[j], σ)        # `Yᵢ ~ Normal(mean=μᵢ, variance=σ²)`
    end
end;
Now let’s generate some synthetic features. We’ll have two continuous features and two binary categorical features.
num_rows = 1_000
x1 = randn(num_rows)
x2 = randn(num_rows)
x3 = Int.(rand(num_rows) .> 0.5)
x4 = Int.(rand(num_rows) .> 0.5)
X = (x1 = x1, x2 = x2, x3 = x3, x4 = x4)
In Soss, the arguments of a model are the values it needs in order to determine a joint distribution over the parameters. At model definition time, Soss doesn’t consider “the variable that will later be observed” to be special in any way, so the “parameters” include that variable as well. In the model `m`, the arguments are the features `X` and the hyperparameters `s` and `t`. Let’s set our hyperparameters:
hyperparams = (s=0.1, t=0.1)
Now we can express this as a `SossMLJModel`:
model = SossMLJModel(;
    model = m,
    hyperparams = hyperparams,
    infer = dynamicHMC,
    response = :y,
);
Here `infer` can be any function that takes a model with its arguments (called a `JointDistribution` in Soss) and returns samples, and `response` tells which variable we’ll be observing.
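For illustration, a custom inference function could look roughly like the sketch below. It just delegates to Soss’s `dynamicHMC`; the exact signature SossMLJ calls it with (here assumed to be the `JointDistribution` plus the observed data) is an assumption on my part.
# Hypothetical custom `infer` (sketch). Assumes SossMLJ passes the JointDistribution
# and the observed data; it simply delegates to Soss's dynamicHMC sampler.
function my_infer(jointdist, data)
    return dynamicHMC(jointdist, data)
end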
Note that `X` above is a `NamedTuple` (generally a `Table`, as expected by MLJ), but `m` takes a matrix. There’s another `transform` argument to `SossMLJModel` that handles this, with the default
@inline function default_transform(tbl)
    return (X = MLJModelInterface.matrix(tbl),)
end
This allows lots of flexibility in how the Soss model represents the features.
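For example, a hand-rolled transform could prepend an intercept column before handing the matrix to the Soss model. This sketch is mine rather than part of the package; it reuses `MLJModelInterface.matrix` from the default above and assumes `transform` is passed as a keyword like the other arguments:
import MLJModelInterface

# Hypothetical transform (sketch): prepend a column of ones so the model fits an intercept.
function with_intercept(tbl)
    Xmat = MLJModelInterface.matrix(tbl)
    return (X = hcat(ones(size(Xmat, 1)), Xmat),)
end

model_with_intercept = SossMLJModel(;
    model = m,
    hyperparams = hyperparams,
    infer = dynamicHMC,
    response = :y,
    transform = with_intercept,
);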
Let’s make some fake data to work with.
args = merge(model.transform(X), hyperparams)
truth = rand(m(args))
On this run, I get
julia> pairs(truth)
pairs(::NamedTuple) with 6 entries:
:p => 4
:η => [-0.0999251, 0.226782, -0.243349, 0.0821754, -0.281185, -0.0371795, 0.103831, -0.016029…
:μ => [-0.0999251, 0.226782, -0.243349, 0.0821754, -0.281185, -0.0371795, 0.103831, -0.016029…
:σ => 0.149136
:β => [0.184801, -0.0151003, 0.135162, -0.25159]
:y => [-0.170022, 0.0709445, -0.197551, -0.0813788, 0.195457, 0.0216646, 0.277798, -0.198569,…
Next we’ll give the model `X` and `truth.y`, and let it figure out the rest. In MLJ we fit models by creating a `machine`:
mach = MLJBase.machine(model, X, truth.y)
fit!(mach)
The `fitresult` for this model contains a `post` field that’s an array of named tuples. We can summarize it like this:
julia> particles(mach.fitresult.post)
(σ = 0.145 ± 0.0036,
β = Particles{Float64,1000}[0.176 ± 0.005, -0.0169 ± 0.0049, 0.129 ± 0.0084, -0.249 ± 0.0088],)
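Since `post` is just an array of named tuples, we can also inspect an individual posterior draw directly; a small sketch (the field names follow the model above):
draw = mach.fitresult.post[1]   # one posterior draw, a NamedTuple
draw.β                          # coefficient vector for this draw
draw.σ                          # dispersion for this draw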
Now for a given `X` (here we’ll just use the same one), we can get the posterior predictive distribution:
predictor_joint = predict_joint(mach, X)
typeof(predictor_joint)
`predictor_joint` is just a distribution we can sample from:
single_sample = rand(predictor_joint; response = :y)
It’s easy to evaluate the `logpdf` on any sample like this:
logpdf(predictor_joint, single_sample)
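Presumably the same call works with the observed data in place of a fresh sample, giving the posterior predictive log-density of `truth.y`; this is my extrapolation rather than something shown above:
# Assumes truth.y has the same shape as single_sample above.
logpdf(predictor_joint, truth.y)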
Instead of just one sample, it’s often useful to get particles, in the sense of MonteCarloMeasurements.jl. Also, note that here and above we can change the `response` variable after fitting. So,
julia> predict_particles(mach, X; response = :β)
4-element Array{Particles{Float64,1000},1}:
0.176 ± 0.005
-0.0169 ± 0.0049
0.129 ± 0.0084
-0.249 ± 0.0088
If we know the ground truth, this allows checks like
julia> truth.β - predict_particles(mach, X; response = :β)
4-element Array{Particles{Float64,1000},1}:
0.00833 ± 0.005
0.00182 ± 0.0049
0.00666 ± 0.0084
-0.00221 ± 0.0088
Or if we don’t, we can do posterior predictive checks against our observed `y` values. Here are the Bayesian p-values for this model:
julia> mean.(truth.y .< predict_particles(mach, X; response = :y)) |> particles
Particles{Float64,1000}
0.498052 ± 0.285
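As a quick follow-up (my addition, not from the post), the per-observation p-values are an ordinary `Vector{Float64}` before the final `|> particles`, so we can look at their spread directly; values piled up near 0 or 1 would suggest systematic misfit:
# p-value per observation: fraction of posterior predictive draws exceeding the observed y
pvals = mean.(truth.y .< predict_particles(mach, X; response = :y))
quantile(pvals, [0.1, 0.5, 0.9])   # should be spread across (0, 1), not piled at the ends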
There’s lots more we can do; check out the rest of this example.
Many thanks to @tlienart for helping me get up to speed on the MLJ interface and getting us started with a previous version of this library, and to @ablaom and the rest of the MLJ team for helpful discussions, and of course for the excellent MLJ framework.