It's more user-friendly and it's much easier to work with programmatically.
For the former, it's just nice to be able to tell the model to no longer consider a variable to be an observation (you can do this using decondition), e.g. if you want to look at the posterior predictive. The way of doing this using the arguments is to pass missing in its place, but this so often goes wrong, and we've seen users get quite confused by it.
When observations are passed as an argument, it's not possible for me to query the model itself as to what is considered an observation without execution (we have to execute the model to determine which of the arguments end up on the left-hand side of a ~). In contrast, if you use the model | ... or condition(model, ...) syntax, the observations are effectively attached to the model in a way that we can then easily inspect using observations.
For example, we are working on simplifying the inner workings of our Gibbs sampler, where we can now simply use the condition(model, ...) syntax to sequentially fix different variables.
With all that being said, in the example with the Kalman filter you give above, it's very understandable that you'd want to pass it as an argument. But in this case your ys never appear on the LHS of a ~, and so Turing won't consider this argument an observation.
So this has nothing to do with condition vs. pass-obs-as-arg; it's just the restriction you get when using @addlogprob!. This is, AFAIK, the same in every other PPL: if you use "manual accumulation of log-probability", then you lose some of the convenience.
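Just to illustrate that pattern, a hypothetical sketch (my_filter_loglik here is a placeholder for whatever routine computes the filter's log-likelihood):

using Turing

@model function kalman_demo(ys)
    θ ~ Normal(0, 1)
    # `ys` only enters through the manually accumulated log-probability below,
    # so it never appears on the LHS of a `~` and Turing has no way of knowing
    # it is an observation.
    @addlogprob! my_filter_loglik(θ, ys)
end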
BUT, we can "fix" this! If you want to make Turing consider your ys as observations, for things like loo, we can implement a "distribution" that effectively does what @addlogprob! does but, since it's a Distribution, at the same time signals to Turing.jl that "hey, this thing on the LHS of this ~ is an observation!".
"""
DiracWithProb(value[, logp])
A Dirac delta distribution with a single point mass at `value` and `logpdf` equal to `logp`.
This is basically the same as `Distributions.Dirac` but without
being restricted to univariate values, in addition to having a non-trivial `logpdf`.
!!! warning
    The non-trivial `logpdf` makes it an improper distribution, but it can be useful
    in certain scenarios when used in a model.
"""
struct DiracWithProb{T1,T2,V} <: Distributions.DiscreteDistribution{V}
value::T1
logp::T2
end
# Determine the variate form from the type of the value.
variateform(x::Real) = Distributions.Univariate
variateform(x::AbstractArray{<:Real,N}) where {N} = Distributions.ArrayLikeVariate{N}

# Constructors; the log-probability defaults to zero.
DiracWithProb(x) = DiracWithProb(x, zero(eltype(x)))
DiracWithProb(x, logp) = DiracWithProb{typeof(x),typeof(logp),variateform(x)}(x, logp)
const UnivariateDiracWithProb{T1,T2} = DiracWithProb{T1,T2,Distributions.Univariate}
const ArrayVariateDiracWithProb{T1,T2,N} = DiracWithProb{T1,T2,Distributions.ArrayLikeVariate{N}}
Base.eltype(::Type{<:DiracWithProb{T}}) where {T} = eltype(T)
# Some required stuff from Distributions.jl.
Distributions.insupport(d::UnivariateDiracWithProb, x::Real) = x == d.value
Distributions.insupport(d::DiracWithProb, x::AbstractArray) = x == d.value
Base.minimum(d::DiracWithProb) = d.value
Base.maximum(d::DiracWithProb) = d.value
Distributions.support(d::DiracWithProb) = (d.value,)
# Evaluation
Distributions.pdf(d::UnivariateDiracWithProb, x::Real) = insupport(d, x) ? exp(d.logp) : zero(d.logp)
Distributions.logpdf(d::UnivariateDiracWithProb, x::Real) = insupport(d, x) ? d.logp : typeof(d.logp)(-Inf)
function Distributions.pdf(d::ArrayVariateDiracWithProb{<:Any,<:Any,N}, x::AbstractArray{<:Real,N}) where {N}
    return insupport(d, x) ? exp(d.logp) : zero(d.logp)
end
function Distributions.logpdf(d::ArrayVariateDiracWithProb{<:Any,<:Any,N}, x::AbstractArray{<:Real,N}) where {N}
    return insupport(d, x) ? d.logp : typeof(d.logp)(-Inf)
end
# Sampling
Base.rand(rng::Random.AbstractRNG, d::UnivariateDiracWithProb) = d.value
Base.rand(rng::Random.AbstractRNG, d::ArrayVariateDiracWithProb) = d.value
Equipped with this, we can do stuff like
julia> @model function demo(x, y)
           s ~ InverseGamma(2, 3)
           m ~ Normal(0, √s)
           x ~ DiracWithProb(x, logpdf(Normal(m, √s), x))
           y ~ DiracWithProb(y, logpdf(Normal(m, √s), y))
       end
demo (generic function with 4 methods)
julia> model = demo(1.5, 2);
julia> chain = sample(model, NUTS(), 1000)
┌ Info: Found initial step size
└   ϵ = 0.8
Sampling 100%|████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:00
Chains MCMC chain (1000×14×1 Array{Float64, 3}):
Iterations = 501:1:1500
Number of chains = 1
Samples per chain = 1000
Wall duration = 0.08 seconds
Compute duration = 0.08 seconds
parameters = s, m
internals = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
s 1.9040 1.4629 0.0561 721.9248 664.9316 0.9992 9499.0100
m 1.1291 0.7297 0.0314 556.9236 473.0818 1.0012 7327.9420
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
s 0.5666 1.0414 1.4280 2.2449 6.0441
m -0.4221 0.7014 1.1622 1.5815 2.5344
julia> pointwise_loglikelihoods(model, MCMCChains.get_sections(chain, :parameters))
Dict{String, Matrix{Float64}} with 2 entries:
"x" => [-1.11969; -0.859207; β¦ ; -1.1811; -1.15306;;]
"y" => [-1.20779; -1.31884; β¦ ; -0.877885; -0.944768;;]
which will allow it to work with loo, etc.
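For instance, hooking this up to ParetoSmooth.jl could look something like the following (note: I'm assuming here that psis_loo wants a (data, draws, chains) array of pointwise log-likelihoods; double-check the expected layout in the ParetoSmooth.jl docs):

using ParetoSmooth

loglikes = pointwise_loglikelihoods(model, MCMCChains.get_sections(chain, :parameters))

# Stack the per-observation (draws × chains) matrices into a (data × draws × chains) array.
ll = permutedims(cat(values(loglikes)...; dims = 3), (3, 1, 2))

psis_loo(ll)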