Feedback request: `rand` and `predict` semantics

cscherrer · June 18, 2022, 5:26pm

Hi, I’m working on Tilde.jl, and thinking some more about how rand and predict should be set up.

In current development, this works:

julia> m = @model n begin
           μ ~ Normal(σ=10)
           σ ~ Exponential()
           y ~ Normal(μ, σ) ^ n
       end;

julia> obs = (y = [3.0, 4.0],);

julia> post = m(2) | obs
ModelPosterior given
    arguments    (:n,)
    observations (:y,)
@model n begin
        μ ~ Normal(σ = 10)
        σ ~ Exponential()
        y ~ Normal(μ, σ) ^ n
    end

julia> r = rand(post)
(μ = 4.99758715428609, σ = 1.582563929898804)

julia> predict(post, r)
(y = [3.6962760705758146, 3.661799255763297],)

I think this makes sense for this example. But what if you had, say y = [3.0, missing]? I think conditioning on missing should be the same as not conditioning at all. So for arrays, conditioning on y ought to really mean conditioning on the known elements of y, so rand should sample the others.

So this would look something like

julia> rand(m(2) | (y = [3.0, missing]))
(μ = 4.99758715428609, σ = 1.582563929898804, y = [3.0, 6.346809940724782])

But it seems weird to have a completely different return type when y is fully observed. So maybe the fully-observed case should instead work like this:

julia> rand(post)
(μ = 4.99758715428609, σ = 1.582563929898804, y = [3.0, 4.0])

This is equivalent to calling rand on the model where y ~ Normal(μ, σ) ^ n is replaced with y ~ Dirac([3.0, 4.0]), or could also be seen as calling rand on the original (unconditioned) model, and updating that result with any observations.

`logdensity` / `rand` interaction

This seems fine, but we need to be sure we have consistent semantics. Since rand and logdensityof are both defined for a post, we should be able to call

logdensityof(post, rand(post))

So… maybe the semantics are that logdensityof can be called without including Dirac components, and if they are given they must match what the model specifies. Does this make sense?

`predict` / `rand` interaction

Currently we can do

julia> predict(post, rand(post))
(y = [17.638620683052217, 17.891076168944927],)

What does this look like if rand also has a y slot? I think this could be something like

Among all sampled (~) variables in a posterior model post, predict(post, pars) samples those that are either (1) not included in pars or (2) are conditioned upon in post

Thoughts?

I think it’s important to get this right, so I’d appreciate any feedback on this. Thanks

EvoArt · June 18, 2022, 6:04pm

"and if they are given they must match what the model specifies. "

Presumably this would throw an error if they don’t. It’s tempting to return -Inf in this case, but I can’t think of a real situation where people pass in the wrong y values and it’s not just an error.

cscherrer · June 18, 2022, 6:40pm

I think rather than decide what happens if they don’t match, it’s helpful to ask what rand means on a posterior.

If it means “sample form the unconditioned model, except for observed values which are treated as Dirac” then the case where the input doesn’t match that needs to return a log-density of -Inf, to match Dirac semantics.

Maybe it should mean something different, but I don’t have any alternatives yet.

marius311 · June 18, 2022, 6:50pm

My (maybe bad) 2c as a casual onlooker but I was indeed curious what rand(post) meant, not knowing for Tilde and not being able to remember for other PPLs. It almost makes the most sense to me that it would mean a sample from the posterior, since I think of the conditioned posterior as some implicit distribution, and rand samples distributions. But ofc this requires sophisticated sampling. So one thing it could be is an error unless its trivially sampleable. Not sure if that makes any of these other downstream issues irrelevant.

EvoArt · June 18, 2022, 7:05pm

I would tend to agree that calling rand() on the posterior makes me think of sampling from the posterior.

cscherrer · June 18, 2022, 7:55pm

This is a great point, makes me think we should just disallow rand on conditional measures. You can always get the prior and call rand on that if. The extra step is easy, and it makes things a lot clearer.

mschauer · June 19, 2022, 8:28am

Not so much disallow but only define it if sampling the conditional is day O(1) and then rand should give such samples

cscherrer · June 19, 2022, 1:12pm

@Tamas_Papp made a similar point a while back, and I agree. To try to nail this down a little, I think a sampling procedure should only be callable using rand if it’s

“fast” in some sense ^\dagger
exact
iid

^\dagger though O(1) might be too restrictive, for example MvNormal requires matrix-vector multiply

Topic		Replies	Views
Sampling predictive with MvNormal in Turing Probabilistic programming turing	2	189	June 7, 2024
Learning Julia by an example of a simple Bayesian linear regression Performance question , performance , distributions , bayesian-inference , linear-regression	10	1142	June 4, 2023
Setting up Turing.jl model for an astrophysics problem Probabilistic programming question	2	434	February 9, 2022
[ANN] RandomExtensions.jl - randomness for all Package Announcements	0	604	March 30, 2019
Frédéric Goualard found an inexactness in our rand() Numerics random	19	1036	September 8, 2022

Feedback request: `rand` and `predict` semantics

logdensity / rand interaction

predict / rand interaction

Thoughts?

Related topics

`logdensity` / `rand` interaction

`predict` / `rand` interaction