I’d like to get Soss.jl connected to MLJ.jl. I can imagine a few ways this might work, and thought I could get some input here on what could be most useful as first steps.
I’ll start with a short overview, since Soss works differently from most PPLs. All of this might actually run, but I’m not testing as I go, so for now just consider it pseudocode.
Here’s how one could write a simple linear model:
```julia
m = @model (αPrior, βPrior, σPrior, xPrior) begin
    α ~ αPrior
    β ~ βPrior
    σ ~ σPrior
    x ~ xPrior
    yhat = α .+ β .* x
    y ~ Normal(yhat, σ)
end
```
In `m`, `(αPrior, βPrior, σPrior, xPrior)` are free variables, which for this purpose can be considered hyperparameters.
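To make that concrete, here's one way the hyperparameters might be filled in with Distributions.jl priors. These particular choices are placeholders of mine, not anything Soss prescribes:

```julia
using Distributions

# Placeholder prior choices -- any univariate distributions would do:
αPrior = Normal(0, 10)                    # intercept
βPrior = Normal(0, 2.5)                   # slope
σPrior = truncated(Normal(0, 1), 0, Inf)  # noise scale, constrained positive
xPrior = Normal(0, 1)                     # inputs
```

With priors like these in hand, `m(αPrior, βPrior, σPrior, xPrior)` gives the bound joint distribution described next.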
One very different thing here is that no data are observed in the definition of a model; that's separate, and happens as part of inference. All the model knows how to do is reason about the relationships between its parameters and generate data.
Models are “function-like”, so `m(αPrior, βPrior, σPrior, xPrior)` gives us a thing I’m currently calling a `BoundModel`, though I’ll probably change that name at some point. It’s really more like a joint distribution. Inference methods in Soss take a joint distribution, values for a subset of the variables, and a sampling algorithm. There are different kinds of these, but for now I’ll focus on the ones that return a sample from the posterior.
A “sample” for me will be an iterator (it’s not this yet, but that’s where things are going). So, for example, something like

```julia
joint = m(αPrior, βPrior, σPrior, xPrior)
post = sample(joint, (x = x0,))
```
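Treating `post` as an iterator means downstream consumers never need the whole chain in memory at once. A sketch of what consuming it might look like; the per-draw named-tuple layout `(α = ..., β = ..., σ = ...)` is my assumption about what each element would hold:

```julia
# Lazily take the first 100 draws and compute a posterior mean for β.
# `post` is the iterator of posterior draws from the example above.
draws = collect(Iterators.take(post, 100))
β̄ = sum(θ.β for θ in draws) / length(draws)
```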
Soss makes it easy (or easier, anyway) to reason about models, so for example it should be easy to turn the above into something like
```julia
mPred = @model (α, β, σ, x) begin
    yhat = α .+ β .* x
    return Normal(yhat, σ)
end
```
From there it’s just a matter of piping the inference results into this prediction.
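Concretely, that piping might look something like the following. Using `rand` to simulate from a bound model, and the `xNew` input, are assumptions on my part, not settled Soss API:

```julia
xNew = randn(10)  # new inputs to predict at

# For each posterior draw θ, bind the predictive model at xNew and
# simulate a response -- one posterior predictive sample per draw:
ypred = (rand(mPred(θ.α, θ.β, θ.σ, xNew)) for θ in post)
```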
A lot of this could change. Maybe `m` should be a closure over hyperparameters, with an inner model taking `x` as input? Lots and lots of possibilities. And for a given model, I’d probably have a macro `@mlj_supervised m x y` to set up the predictive distribution and the MLJ model type and methods in the right way.
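For reference, the scaffolding such a macro would need to generate is roughly MLJ's model-struct-plus-`fit`/`predict` contract. A hand-written sketch of that contract follows; this is not the macro's actual output, and the struct, its field names, and the choice to condition on both `x` and `y` are all mine:

```julia
import MLJBase

# Hypothetical wrapper: the MLJ model's hyperparameters are the priors.
mutable struct SossLinear <: MLJBase.Probabilistic
    αPrior; βPrior; σPrior; xPrior
end

function MLJBase.fit(model::SossLinear, verbosity::Int, X, y)
    joint = m(model.αPrior, model.βPrior, model.σPrior, model.xPrior)
    post  = sample(joint, (x = X, y = y))  # condition on the training data
    return post, nothing, nothing          # (fitresult, cache, report)
end

function MLJBase.predict(model::SossLinear, fitresult, Xnew)
    # One predictive distribution per posterior draw; how these get
    # aggregated into a single MLJ-style distribution is an open question.
    [mPred(θ.α, θ.β, θ.σ, Xnew) for θ in fitresult]
end
```

The open design questions are exactly where the hyperparameters live (a struct like this vs. a closure over `m`) and what `predict` should return for a probabilistic model.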
@oxinabox, you had mentioned a need for this sort of thing. What’s a simple use case that would be useful to you?