# Turing.jl: prior on quantiles

I’d like to develop a model with a prior on quantiles of the outcome, rather than on the parameters themselves. This is commonly done in expert-elicitation contexts. However, when I try to implement this model, I’m clearly not getting the right thing. I’m wondering whether I need to add a Jacobian?

Here’s an MWE:

``````
using Turing, Distributions

@model Demo(y) = begin
    # define the parameters as vaguely as possible
    μ ~ Normal(0, 100)
    σ ~ Truncated(Normal(0, 10), 0, Inf)

    # we're going to use this distribution repeatedly
    dist = Normal(μ, σ)

    # define quantiles -- these are deterministic
    q50 = quantile(dist, 0.50)
    q90 = quantile(dist, 0.90)

    # prior distribution on quantiles - pretend this is "from experts"
    q50 ~ Normal(5, 0.1)  # our prior on the median is about 5
    q90 ~ Normal(10, 0.1) # our prior on the 90th percentile is about 10

    # data model
    y .~ dist
end

y = 
model = Demo(y)

prior = sample(model, Prior(), 10_000)
yhat_prior = rand.(Normal.(prior[:μ], prior[:σ]))[:]
quantile(yhat_prior, 0.50) # returns about 0
quantile(yhat_prior, 0.90) # returns about 130
``````

Thinking that maybe the problem was the `Prior()`, I rewrote it as

``````
using Turing, Distributions, DynamicHMC

@model Demo() = begin
    # define the parameters as vaguely as possible
    μ ~ Normal(0, 100)
    σ ~ Truncated(Normal(0, 10), 0, Inf)

    # we're going to use this distribution repeatedly
    dist = Normal(μ, σ)

    # define quantiles -- these are deterministic
    q50 = quantile(dist, 0.50)
    q90 = quantile(dist, 0.90)

    # prior distribution on quantiles - pretend this is "from experts"
    q50 ~ Normal(5, 0.1)  # our prior on the median is about 5
    q90 ~ Normal(10, 0.1) # our prior on the 90th percentile is about 10
end

prior = sample(Demo(), DynamicNUTS(), 10_000)
yhat_prior = rand.(Normal.(prior[:μ], prior[:σ]))[:]
quantile(yhat_prior, 0.50)
quantile(yhat_prior, 0.90)
``````

but I get the same results.

Hey! This issue shows up because Turing.jl decides what’s an observation and what’s a latent variable by matching variables to the arguments of the model.

This means that anything NOT in the arguments of the model cannot be observed, i.e. your `q50` and `q90`. In cases where you really want to do this, you can make use of the `DynamicPPL.@addlogprob!` macro, e.g.:

``````
@model Demo(y) = begin
    # define the parameters as vaguely as possible
    μ ~ Normal(0, 100)
    σ ~ Truncated(Normal(0, 10), 0, Inf)

    # we're going to use this distribution repeatedly
    dist = Normal(μ, σ)

    # define quantiles -- these are deterministic
    q50 = quantile(dist, 0.50)
    q90 = quantile(dist, 0.90)

    # prior distribution on quantiles - pretend this is "from experts"
    DynamicPPL.@addlogprob! logpdf(Normal(5, 0.1), q50)  # our prior on the median is about 5
    DynamicPPL.@addlogprob! logpdf(Normal(10, 0.1), q90) # our prior on the 90th percentile is about 10

    # data model
    y .~ dist
end
``````

And just some general comments on writing models that you might find useful:

• You can write `@model function Demo(y) ... end` instead of the `@model Demo(y) = begin ... end` notation. You might have come across the latter because, back in the day, the `function` syntax wasn’t possible; now that it is, that’s the preferable way:) They’re functionally equivalent, it’s just a matter of style.
• Use `truncated` instead of `Truncated`. This isn’t related to Turing.jl btw, it’s just what’s recommended by Distributions.jl. It does some additional work to ensure that the types are consistent.
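For what it’s worth, here’s the first model from the question with both suggestions applied — purely a restyling sketch, the behavior is unchanged:

``````
using Turing, Distributions

# Same model as in the question, using the `function` syntax
# and `truncated` instead of `Truncated`.
@model function Demo(y)
    μ ~ Normal(0, 100)
    σ ~ truncated(Normal(0, 10), 0, Inf)
    y .~ Normal(μ, σ)
end
``````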

EDIT: We actually very recently made some changes to the “compiler” used in Turing.jl which would allow us to introduce an `@observe` macro, giving the user a way of “forcing” an observation, regardless of whether or not it’s in the arguments of the model, e.g. you could do

``````
@observe q50 ~ Normal(5, 0.1)
``````

Is this something that would be desirable? If there’s interest in something like that, we might just implement it:)


Dear Tor,

Thanks very much for the quick response on a Saturday night! I’m more familiar with Stan, where you have to declare explicitly which variables are data and which are parameters, so I really appreciate the clarification. It looks like `DynamicPPL.@addlogprob!` is quite similar to Stan’s `target +=`. Can I assume that Turing takes care of any needed chain rule/Jacobian adjustments? (Certainly my toy case here seems to work as if it does.)

Thanks also for the two tips. I’ll keep those both in mind!

I think that the `@observe` macro would be quite interesting. I don’t know what other use cases might be, but being able to encode prior distributions on quantities that are derived from parameters – rather than on the parameters themselves – would certainly be useful to me.

For the sake of anyone looking through this in the future, this works:

``````
using Turing, Distributions, DynamicHMC

@model function Demo()
    # define the parameters as vaguely as possible
    μ ~ Normal(0, 100)
    σ ~ truncated(Normal(0, 10), 0, Inf)

    # we're going to use this distribution repeatedly
    dist = Normal(μ, σ)

    # define quantiles -- these are deterministic
    q50 = quantile(dist, 0.50)
    q90 = quantile(dist, 0.90)

    # prior distribution on quantiles - pretend this is "from experts"
    DynamicPPL.@addlogprob! logpdf(Normal(3, 0.1), q50)  # our prior on the median is about 3
    DynamicPPL.@addlogprob! logpdf(Normal(10, 0.1), q90) # our prior on the 90th percentile is about 10
end

prior = sample(Demo(), DynamicNUTS(), 100_000)
yhat_prior = rand.(Normal.(prior[:μ], prior[:σ]))[:]
quantile(yhat_prior, 0.50) # 2.9986086587820795
quantile(yhat_prior, 0.90) # 9.982226223019403
``````

but this formulation does not

``````
@model function Demo(y)
    # define the parameters as vaguely as possible
    μ ~ Normal(0, 100)
    σ ~ truncated(Normal(0, 10), 0, Inf)

    # we're going to use this distribution repeatedly
    dist = Normal(μ, σ)

    # define quantiles -- these are deterministic
    q50 = quantile(dist, 0.50)
    q90 = quantile(dist, 0.90)

    # prior distribution on quantiles - pretend this is "from experts"
    DynamicPPL.@addlogprob! logpdf(Normal(3, 0.1), q50)  # our prior on the median is about 3
    DynamicPPL.@addlogprob! logpdf(Normal(10, 0.1), q90) # our prior on the 90th percentile is about 10

    y .~ dist
end

y = 
prior = sample(Demo(y), Prior(), 100_000)
yhat_prior = rand.(Normal.(prior[:μ], prior[:σ]))[:]
quantile(yhat_prior, 0.50)
quantile(yhat_prior, 0.90)
``````

which (I assume?) has to do with `Prior()` not treating the `DynamicPPL.@addlogprob!` statements as part of the prior.
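As a sanity check on the working run above (assuming the `Normal(3, 0.1)` and `Normal(10, 0.1)` quantile priors from that model): a Normal’s quantiles are linear in its parameters, `quantile(Normal(μ, σ), p) = μ + σ * quantile(Normal(0, 1), p)`, so such tight quantile priors pin down `μ` and `σ` almost exactly in closed form:

``````
using Distributions

# Φ⁻¹(0.5) = 0, so the quantile priors imply
#   q50 = μ                 ≈ 3
#   q90 = μ + σ * Φ⁻¹(0.9)  ≈ 10
z90 = quantile(Normal(0, 1), 0.9)  # ≈ 1.2816
μ_implied = 3.0
σ_implied = (10.0 - 3.0) / z90     # ≈ 5.46
``````

which lines up well with the sampled quantiles above (≈ 3.0 and ≈ 9.98).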

> Can I assume that Turing takes care of any needed chain rule/Jacobian adjustments? (Certainly my toy case here seems to work as if it does.)

In general yes, but not when you use `@addlogprob!`. There, Turing assumes the user knows what they’re doing and leaves the right-hand side as is. So `μ` and `σ` will internally be transformed and the log joint adjusted accordingly, but in the model body you only ever see `μ` and `σ` in their original space, even though a sampler such as HMC works with the unconstrained variables. “Conditionally deterministic” variables such as `q50` and `q90`, however, the sampler doesn’t care about; all that matters is how they affect the log joint. So there’s no need to transform `q50` and `q90` to unconstrained space, and thus no Jacobian adjustment to the joint density is needed.

Does this make sense?

TL;DR: Jacobian adjustments are performed automatically for latent variables when the sampler requires it, e.g. HMC, but the sampler doesn’t care about conditionally deterministic variables, so AFAIK there should never be a case where this is something to think about when using `@addlogprob!`.
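To make that concrete, here’s a rough illustrative sketch of the kind of change-of-variables adjustment involved — this is not Turing’s actual internals:

``````
using Distributions

prior_σ = truncated(Normal(0, 10), 0, Inf)

# A sampler like HMC works with the unconstrained ζ = log(σ).
# Expressing the prior density in ζ requires adding the log absolute
# Jacobian of σ = exp(ζ), which is log|dσ/dζ| = log(exp(ζ)) = ζ.
function logprior_unconstrained(ζ)
    σ = exp(ζ)
    return logpdf(prior_σ, σ) + ζ
end

# A conditionally deterministic quantity like q90 is never transformed
# by the sampler, so a term added via `@addlogprob!` enters the log
# joint as-is, with no Jacobian term.
``````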

> which (I assume?) has to do with `Prior()` not treating the `DynamicPPL.@addlogprob!` statements as part of the prior.

What do you mean by “doesn’t work” here? You get different quantiles from the above model, which is supposed to be the prior?

If so, yeah, that won’t work. `Prior` is a simple sampler that samples from the right-hand sides of `~` statements. As soon as you want to adjust the log density using some other quantities, e.g. your `q50` and `q90`, then you’re back to the problem that MCMC attempts to solve:) It’s a good point though, and maybe we should add an example to our docs talking a bit more about this.

But it’s actually possible to sample from the model’s prior using MCMC methods, though it’s a bit intricate. This is essentially writing the `sample` loop from AbstractMCMC.jl by hand (see its README for a bit more information), plus using some convenience methods provided by DynamicPPL.jl.

``````
julia> using Turing, Random

julia> rng = Random.MersenneTwister(42);

julia> y = [0.0];

julia> m = Demo(y);

julia> alg = NUTS();

julia> # `spl` is a wrapper from DynamicPPL.jl that inherits a bunch of convenient impls.
spl = DynamicPPL.Sampler(alg);

julia> spl_initial = DynamicPPL.initialsampler(spl) # Sampler used to initialize `spl`
DynamicPPL.SampleFromUniform()

julia> # Get the initial state using `spl_initial` and specify that we're only interested
# in the prior by providing `PriorContext()`.
var_info = DynamicPPL.VarInfo(rng, m, spl_initial, DynamicPPL.PriorContext())

julia> # Get the first `transition` and `state` for `spl`.
transition, state = DynamicPPL.initialstep(rng, m, spl, var_info);

julia> # Set up a container for the samples using `transition` as the "template" for how samples look.
samples = AbstractMCMC.samples(transition, m, spl)
Turing.Inference.HMCTransition{NamedTuple{(:μ, :σ), Tuple{Tuple{Vector{Float64}, Vector{String}}, Tuple{Vector{Float64}, Vector{String}}}}, NamedTuple{(:n_steps, :is_accept, :acceptance_rate, :log_density, :hamiltonian_energy, :hamiltonian_energy_error, :max_hamiltonian_energy_error, :tree_depth, :numerical_error, :step_size, :nom_step_size), Tuple{Int64, Bool, Float64, Float64, Float64, Float64, Float64, Int64, Bool, Float64, Float64}}, Float64}[]

julia> # Sample! Notice that we update the `state` at every iteration and pass the previous `state` to `step`.
for i = 1:10_000
    sample, state = AbstractMCMC.step(rng, m, spl, state)
    samples = AbstractMCMC.save!!(samples, sample, i, m, spl)
end

julia> # Since `samples` is a simple array of samples we instead bundle those into a `MCMCChains.Chains`.
prior = AbstractMCMC.bundle_samples(samples, m, spl, state, MCMCChains.Chains)
Chains MCMC chain (10000×14×1 Array{Float64, 3}):

Iterations        = 1:10000
Thinning interval = 1
Chains            = 1
Samples per chain = 10000
parameters        = μ, σ
internals         = acceptance_rate, hamiltonian_energy, hamiltonian_energy_error, is_accept, log_density, lp, max_hamiltonian_energy_error, n_steps, nom_step_size, numerical_error, step_size, tree_depth

Summary Statistics
parameters      mean       std   naive_se      mcse         ess      rhat
Symbol   Float64   Float64    Float64   Float64     Float64   Float64

μ    2.9987    0.0997     0.0010    0.0023   1699.0814    1.0013
σ    5.4623    0.1076     0.0011    0.0019   2807.4221    1.0005

Quantiles
parameters      2.5%     25.0%     50.0%     75.0%     97.5%
Symbol   Float64   Float64   Float64   Float64   Float64

μ    2.7983    2.9320    3.0014    3.0693    3.1890
σ    5.2567    5.3890    5.4608    5.5351    5.6749

julia> yhat_prior = rand.(Normal.(prior[:μ], prior[:σ]))[:];

julia> quantile(yhat_prior, 0.50) # ✓
3.018120334070474

julia> quantile(yhat_prior, 0.90) # ✓
10.157027708693064
``````

This is really touching internals though, so I’m not certain you want to rely on it. This was as much for me as for you, as I wanted to see if we could provide this in a neat format. I’m thinking maybe we should add a convenient way of saying "only sample the prior, but sample it using `alg` instead of `Prior()`".

EDIT: I made an issue regarding this btw: Sample from prior using inference algorithm · Issue #1591 · TuringLang/Turing.jl · GitHub

1. (RE Jacobian) yes, this makes sense - thanks.
2. (RE doesn’t work) – yes, that’s what I meant. Again, it makes sense that Turing doesn’t treat the `@addlogprob!` lines as part of the prior.
3. (RE implementation) looks interesting! Happy to help test things out on my real data, which is (only slightly) more complicated than this contrived Normal example.