@dilumaluthge and I are pleased to announce the release of SossMLJ.jl, which makes it easy to use models from the Soss.jl probabilistic programming language in MLJ.jl.
Here’s a linear regression example, expressed as a special case of a generalized linear model. First, we import the necessary packages:
using Distributions
using MLJBase
using Soss
using SossMLJ
using Statistics
Next we write the model in Soss:
m = @model X, s, t begin
    p = size(X, 2)             # number of features
    β ~ Normal(0, s) |> iid(p) # coefficients
    σ ~ HalfNormal(t)          # dispersion
    η = X * β                  # linear predictor
    μ = η                      # `μ = g⁻¹(η) = η`
    y ~ For(eachindex(μ)) do j
        Normal(μ[j], σ)        # `Yᵢ ~ Normal(mean=μᵢ, variance=σ²)`
    end
end;
Now let’s generate some synthetic features. We’ll have two continuous features and two binary categorical features.
num_rows = 1_000
x1 = randn(num_rows)
x2 = randn(num_rows)
x3 = Int.(rand(num_rows) .> 0.5)
x4 = Int.(rand(num_rows) .> 0.5)
X = (x1 = x1, x2 = x2, x3 = x3, x4 = x4)
In Soss, the arguments of a model are the values it needs in order to determine a joint distribution over the parameters. At model definition time, Soss doesn’t consider “the variable that will later be observed” to be special in any way, so the “parameters” include that variable as well. In the model `m`, the arguments are the features `X` and the hyperparameters `s` and `t`. Let’s set our hyperparameters:
hyperparams = (s=0.1, t=0.1)
Now we can express this as a `SossMLJModel`:
model = SossMLJModel(;
    model = m,
    hyperparams = hyperparams,
    infer = dynamicHMC,
    response = :y,
);
Here `infer` can be any function that takes a model with its arguments (called a `JointDistribution` in Soss) and returns samples, and `response` tells which variable we’ll be observing.
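For illustration, a custom inference function could look roughly like the sketch below. It just delegates to Soss’s `dynamicHMC`; the exact signature SossMLJ calls it with (here assumed to be the `JointDistribution` plus the observed data) is an assumption on my part.
# Hypothetical custom `infer` (sketch). Assumes SossMLJ passes the JointDistribution
# and the observed data; it simply delegates to Soss's dynamicHMC sampler.
function my_infer(jointdist, data)
    return dynamicHMC(jointdist, data)
end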
Note that `X` above is a `NamedTuple` (generally a `Table`, as expected by MLJ), but `m` takes a matrix. There’s another `transform` argument to `SossMLJModel` that handles this, with the default
@inline function default_transform(tbl)
    return (X = MLJModelInterface.matrix(tbl),)
end
This allows lots of flexibility in how the Soss model represents the features.
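For example, a hand-rolled transform could prepend an intercept column before handing the matrix to the Soss model. This sketch is mine rather than part of the package; it reuses `MLJModelInterface.matrix` from the default above and assumes `transform` is passed as a keyword like the other arguments:
import MLJModelInterface

# Hypothetical transform (sketch): prepend a column of ones so the model fits an intercept.
function with_intercept(tbl)
    Xmat = MLJModelInterface.matrix(tbl)
    return (X = hcat(ones(size(Xmat, 1)), Xmat),)
end

model_with_intercept = SossMLJModel(;
    model = m,
    hyperparams = hyperparams,
    infer = dynamicHMC,
    response = :y,
    transform = with_intercept,
);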
Let’s make some fake data to work with.
args = merge(model.transform(X), hyperparams)
truth = rand(m(args))
On this run, I get
julia> pairs(truth)
pairs(::NamedTuple) with 6 entries:
:p => 4
:η => [-0.0999251, 0.226782, -0.243349, 0.0821754, -0.281185, -0.0371795, 0.103831, -0.016029…
:μ => [-0.0999251, 0.226782, -0.243349, 0.0821754, -0.281185, -0.0371795, 0.103831, -0.016029…
:σ => 0.149136
:β => [0.184801, -0.0151003, 0.135162, -0.25159]
:y => [-0.170022, 0.0709445, -0.197551, -0.0813788, 0.195457, 0.0216646, 0.277798, -0.198569,…
Next we’ll give the model `X` and `truth.y`, and let it figure out the rest. In MLJ we fit models by creating a `machine`:
mach = MLJBase.machine(model, X, truth.y)
fit!(mach)
The `fitresult` for this model contains a `post` field that’s an array of named tuples. We can summarize it like this:
julia> particles(mach.fitresult.post)
(σ = 0.145 ± 0.0036,
β = Particles{Float64,1000}[0.176 ± 0.005, -0.0169 ± 0.0049, 0.129 ± 0.0084, -0.249 ± 0.0088],)
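Since `post` is just an array of named tuples, we can also inspect an individual posterior draw directly; a small sketch (the field names follow the model above):
draw = mach.fitresult.post[1]   # one posterior draw, a NamedTuple
draw.β                          # coefficient vector for this draw
draw.σ                          # dispersion for this draw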
Now for a given `X` (here we’ll just use the same one), we can get the posterior predictive distribution:
predictor_joint = predict_joint(mach, X)
typeof(predictor_joint)
`predictor_joint` is just a distribution we can sample from:
single_sample = rand(predictor_joint; response = :y)
It’s easy to evaluate the `logpdf` on any sample like this:
logpdf(predictor_joint, single_sample)
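Presumably the same call works with the observed data in place of a fresh sample, giving the posterior predictive log-density of `truth.y`; this is my extrapolation rather than something shown above:
# Assumes truth.y has the same shape as single_sample above.
logpdf(predictor_joint, truth.y)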
Instead of just one sample, it’s often useful to get particles, in the sense of MonteCarloMeasurements.jl. Also, note that here and above we can change the `response` variable after fitting. So,
julia> predict_particles(mach, X; response = :β)
4-element Array{Particles{Float64,1000},1}:
0.176 ± 0.005
-0.0169 ± 0.0049
0.129 ± 0.0084
-0.249 ± 0.0088
If we know the ground truth, this allows checks like
julia> truth.β - predict_particles(mach, X; response = :β)
4-element Array{Particles{Float64,1000},1}:
0.00833 ± 0.005
0.00182 ± 0.0049
0.00666 ± 0.0084
-0.00221 ± 0.0088
Or if we don’t, we can do posterior predictive checks against our observed `y` values. Here are the Bayesian p-values for this model:
julia> mean.(truth.y .< predict_particles(mach, X; response = :y)) |> particles
Particles{Float64,1000}
0.498052 ± 0.285
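As a quick follow-up (my addition, not from the post), the per-observation p-values are an ordinary `Vector{Float64}` before the final `|> particles`, so we can look at their spread directly; values piled up near 0 or 1 would suggest systematic misfit:
# p-value per observation: fraction of posterior predictive draws exceeding the observed y
pvals = mean.(truth.y .< predict_particles(mach, X; response = :y))
quantile(pvals, [0.1, 0.5, 0.9])   # should be spread across (0, 1), not piled at the ends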
There’s lots more we can do; check out the rest of this example.
Many thanks to @tlienart for helping me get up to speed on the MLJ interface and getting us started with a previous version of this library, and to @ablaom and the rest of the MLJ team for helpful discussions, and of course for the excellent MLJ framework.