Hi, I’m working on Tilde.jl, and thinking some more about how rand and predict should be set up.
In current development, this works:
julia> m = @model n begin
           μ ~ Normal(σ=10)
           σ ~ Exponential()
           y ~ Normal(μ, σ) ^ n
       end;
julia> obs = (y = [3.0, 4.0],);
julia> post = m(2) | obs
ModelPosterior given
arguments (:n,)
observations (:y,)
@model n begin
μ ~ Normal(σ = 10)
σ ~ Exponential()
y ~ Normal(μ, σ) ^ n
end
julia> r = rand(post)
(μ = 4.99758715428609, σ = 1.582563929898804)
julia> predict(post, r)
(y = [3.6962760705758146, 3.661799255763297],)
I think this makes sense for this example. But what if you had, say y = [3.0, missing]? I think conditioning on missing should be the same as not conditioning at all. So for arrays, conditioning on y ought to really mean conditioning on the known elements of y, so rand should sample the others.
So this would look something like
julia> rand(m(2) | (y = [3.0, missing],))
(μ = 4.99758715428609, σ = 1.582563929898804, y = [3.0, 6.346809940724782])
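To make the intended semantics concrete, here is a rough sketch in plain Julia rather than Tilde.jl: observed entries of y are kept, and only the missing ones are drawn. The helper name fill_missing is hypothetical, and the Normal(μ, σ) draw is written by hand as μ + σ * randn().

```julia
using Random

# Hypothetical helper, not part of Tilde.jl: keep the observed entries of `y`
# and draw each missing entry from Normal(μ, σ).
function fill_missing(rng::AbstractRNG, y, μ, σ)
    return [ismissing(v) ? μ + σ * randn(rng) : v for v in y]
end

rng = MersenneTwister(1)
y_obs = [3.0, missing]
y_full = fill_missing(rng, y_obs, 5.0, 1.5)
```

The first element of y_full is always the observed 3.0; only the second is random.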
But it seems weird to have a completely different return type when y is fully observed. So maybe the fully-observed case should instead work like this:
julia> rand(post)
(μ = 4.99758715428609, σ = 1.582563929898804, y = [3.0, 4.0])
This is equivalent to calling rand on the model where y ~ Normal(μ, σ) ^ n is replaced with y ~ Dirac([3.0, 4.0]), or could also be seen as calling rand on the original (unconditioned) model, and updating that result with any observations.
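The "sample, then update with observations" view can be sketched with a NamedTuple merge. This is only meant to show the shape of the result: rand_joint is a hypothetical stand-in for rand on the unconditioned model, written with Base and Random only, and it assumes Normal(σ=10) means mean 0 and standard deviation 10.

```julia
using Random

# Hypothetical stand-in for `rand` on the unconditioned model.
# Assumes Normal(σ=10) has mean 0 and standard deviation 10,
# and Exponential() has rate 1.
function rand_joint(rng::AbstractRNG, n)
    μ = 10 * randn(rng)
    σ = randexp(rng)
    y = μ .+ σ .* randn(rng, n)
    return (μ = μ, σ = σ, y = y)
end

obs = (y = [3.0, 4.0],)
draw = rand_joint(MersenneTwister(1), 2)

# `merge` on NamedTuples lets the right-hand values win, so the sampled `y`
# is overwritten by the observed one while μ and σ are left alone.
updated = merge(draw, obs)
```

Here updated has the same keys as draw, with y equal to the observed [3.0, 4.0].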
logdensity / rand interaction
This seems fine, but we need to be sure we have consistent semantics. Since rand and logdensityof are both defined for a post, we should be able to call
logdensityof(post, rand(post))
So… maybe the semantics are that logdensityof can be called without including Dirac components, and if they are given they must match what the model specifies. Does this make sense?
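One way to read that rule, as a sketch only: a check that runs before the density is evaluated. The name check_observed is hypothetical and nothing here is Tilde.jl API. It accepts parameters with or without the observed variables, and throws if a supplied value disagrees with the observation.

```julia
# Hypothetical helper, not Tilde.jl API: observed variables may be omitted
# from `pars`, but any that are present must equal the observation.
function check_observed(pars::NamedTuple, obs::NamedTuple)
    for k in keys(obs)
        if haskey(pars, k) && !isequal(pars[k], obs[k])
            throw(ArgumentError("value for $k does not match the observation"))
        end
    end
    return merge(pars, obs)  # always carry the observed values forward
end

obs = (y = [3.0, 4.0],)
a = check_observed((μ = 5.0, σ = 1.5), obs)                  # `y` omitted
b = check_observed((μ = 5.0, σ = 1.5, y = [3.0, 4.0]), obs)  # `y` given and matching
```

Both calls return the same NamedTuple, so a density evaluated on either would agree. Passing y = [0.0, 0.0] instead would throw an ArgumentError.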
predict / rand interaction
Currently we can do
julia> predict(post, rand(post))
(y = [17.638620683052217, 17.891076168944927],)
What does this look like if rand also has a y slot? I think this could be something like
Among all sampled (~) variables in a posterior model post, predict(post, pars) samples those that are either (1) not included in pars or (2) conditioned upon in post.
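Spelled out as a selection rule over variable names, this could look like the sketch below. The helper predicted_names is hypothetical and operates on plain NamedTuples, not on Tilde.jl types.

```julia
# Hypothetical helper, not Tilde.jl API: the names `predict(post, pars)` would
# sample, i.e. those (1) not included in `pars` or (2) conditioned upon in `post`.
function predicted_names(sampled, pars::NamedTuple, obs::NamedTuple)
    return [v for v in sampled if !haskey(pars, v) || haskey(obs, v)]
end

sampled = (:μ, :σ, :y)
obs = (y = [3.0, 4.0],)

without_y = predicted_names(sampled, (μ = 5.0, σ = 1.5), obs)
with_y = predicted_names(sampled, (μ = 5.0, σ = 1.5, y = [3.0, 4.0]), obs)
```

Both calls give [:y], so predict would return the same y slot whether or not the result of rand already carries the observed y.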
Thoughts?
I think it’s important to get this right, so I’d appreciate any feedback on this. Thanks