Turing - "Automatic" Non-Centered Distribution Types

Hi Folks,

I experimented this morning with some “automatic” non-centered Normal distributions, and I think I have something that works quite well. Many of you will know the trick of replacing

mu ~ <mu prior>
sigma ~ <sigma prior>
xs ~ filldist(Normal(mu, sigma), N)
xs_observed ~ arraydist([Normal(x,s) for (x,s) in zip(xs, sigma_xs_observed)])

when the posterior on xs is dominated by the prior with the non-centered version

mu ~ <mu prior>
sigma ~ <sigma prior>
xs_unit ~ filldist(Normal(0,1), N)
xs = mu .+ sigma .* xs_unit
xs_observed ~ arraydist([Normal(x,s) for (x,s) in zip(xs, sigma_xs_observed)])

to avoid the pathological “Neal’s funnel” geometry in the model and dramatically improve sampling. But this is inconvenient in several ways: among them, xs is now a generated quantity, and so is no longer recorded in the chain automatically. And while in this case it’s easy to see how to shift and scale the funnel away, sometimes you want to shift and scale by a mix of the prior and the observations, and sometimes even more complicated expressions are involved.
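For concreteness, the identity the trick relies on is the usual change of variables: if xs = mu .+ sigma .* xs_unit with xs_unit ~ Normal(0, 1), then log N(x; mu, sigma) = log N(z; 0, 1) - log(sigma), where z = (x - mu) / sigma. Here is a minimal base-Julia check of that identity (the values are arbitrary examples, and the hand-written log-density is just for this sketch):

```julia
# Normal log-density written out by hand so no packages are needed.
lognormpdf(x, mu, sigma) = -0.5 * log(2pi) - log(sigma) - 0.5 * ((x - mu) / sigma)^2

mu, sigma, x = 1.3, 0.7, 2.1
z = (x - mu) / sigma                 # the "unit" variable the sampler sees

# Change of variables: the centered and non-centered densities agree
# once the log-Jacobian term -log(sigma) is included.
lhs = lognormpdf(x, mu, sigma)
rhs = lognormpdf(z, 0.0, 1.0) - log(sigma)
println(isapprox(lhs, rhs))          # true
```

The -log(sigma) term is exactly what the bijector below has to account for.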

It would be great to have a distribution that takes care of this stabilizing shift-and-scale for you, so I wrote one:

using Bijectors
using ChangesOfVariables
using Distributions
using Random: AbstractRNG
using Turing

# A Normal(mu, sigma) prior whose samples are presented to the sampler
# shifted by mu_centering and scaled by sigma_centering.
struct NonCenteredNormal{T <: Real} <: ContinuousUnivariateDistribution
    mu::T
    sigma::T
    mu_centering::T
    sigma_centering::T
end

# Bijector mapping x to the unit-scale variable (x - mu) / sigma.
struct NonCenteredNormalBijector{T <: Real} <: Bijector
    mu::T
    sigma::T
end

function NonCenteredNormal(mu, sigma)
    NonCenteredNormal(mu, sigma, mu, sigma)
end

function ChangesOfVariables.with_logabsdet_jacobian(b::NonCenteredNormalBijector{T}, x) where T <: Real
    y = (x - b.mu) / b.sigma
    logdetjac = -log(b.sigma)  # dy/dx = 1/sigma (sigma > 0 assumed)
    y, logdetjac
end
function ChangesOfVariables.with_logabsdet_jacobian(ib::Bijectors.Inverse{<:NonCenteredNormalBijector}, y)
    x = y * ib.orig.sigma + ib.orig.mu
    logdetjac = log(ib.orig.sigma)  # dx/dy = sigma
    x, logdetjac
end

Distributions.logpdf(d::NonCenteredNormal, x::Real) = logpdf(Normal(d.mu, d.sigma), x)
Distributions.rand(rng::AbstractRNG, d::NonCenteredNormal) = rand(rng, Normal(d.mu, d.sigma))
# The sampler works in the space of (x - mu_centering) / sigma_centering.
Bijectors.bijector(d::NonCenteredNormal) = NonCenteredNormalBijector(d.mu_centering, d.sigma_centering)
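As a quick standalone sanity check of the transform logic, the forward and inverse maps should round-trip, and their log-Jacobians should cancel. This sketch mirrors the two with_logabsdet_jacobian methods as plain closures (with example values for mu and sigma) so it runs with no packages:

```julia
mu, sigma = 0.5, 2.0  # example values; sigma > 0 assumed throughout

# Mirror the forward and inverse with_logabsdet_jacobian methods.
forward(x) = ((x - mu) / sigma, -log(sigma))
inverse(y) = (y * sigma + mu, log(sigma))

x = 3.25
y, fwd_ldj = forward(x)
x2, inv_ldj = inverse(y)

println(isapprox(x2, x))            # true: round-trip recovers x
println(iszero(fwd_ldj + inv_ldj))  # true: log-Jacobians cancel
```

The cancellation is what guarantees the transformed log-density seen by the sampler is consistent whichever direction the transform is applied.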

Now you can write models like the one above cleanly while still taking advantage of the shift-and-scale transformation:

mu ~ <mu prior>
sigma ~ <sigma prior>
xs ~ filldist(NonCenteredNormal(mu, sigma), N)
xs_observed ~ arraydist([Normal(x,s) for (x,s) in zip(xs, sigma_xs_observed)])

and Turing will take care of the non-centering behind the scenes for you. Optionally, you can use NonCenteredNormal(mu, sigma, location_to_center, scale_to_center) if your posterior isn’t prior-dominated: the sampler will see parameters with scale scale_to_center around location location_to_center, while the N(mu, sigma) prior is still applied.

Is this too specialized to put into Turing? Does everyone who knows about it just do it by hand and is happy to continue? Thoughts on the code?

Cheers,
Will
