Is it possible to set up a model in Turing so that the posterior distribution of some parameters satisfies a set of constraints?
For example, suppose I want to do linear regression, but from physical principles I know that the coefficient matrix (L) of my model must be positive definite, and the cross-diagonal terms must be equal. E.g.,
Y ~ L * X # linear model of my process
# with variables I observed
Y = [y1, y2]
X = [x1, x2]
# and parameters I want to infer
L = [L11 L12;
     L21 L22]
# subject to these constraints:
# L must be positive definite
L11 > 0 && L11 * L22 - L12 * L21 > 0 # Sylvester's criterion: all leading principal minors positive
# The cross-diagonal terms of L must be equal
L12 == L21
Is it possible to ensure that the posterior distribution of L obeys these constraints? I am also not sure what effect the prior has on the posterior, i.e. should the prior of L also satisfy the constraints?
Is it possible to ensure that the posterior distribution of L obeys these constraints?
Yep; this can be done for arbitrary constraints by just adding -Inf to the log joint, i.e. putting
if ... # your check for whether the constraints are violated
    Turing.@addlogprob! -Inf
    # Do an early return, since this entire sample will
    # be rejected anyway.
    return nothing
end
in your model. But this is a bad idea in a lot of cases, e.g. if you're working with continuous variables and have equality constraints (as you want to here), since these will be satisfied with probability 0.
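For concreteness, here's a minimal sketch of that pattern applied to the positive-definiteness constraint from the question (the model name and priors are placeholders of my own; only the inequality constraint is enforced, for exactly the reason above):

using Turing, LinearAlgebra

# Hypothetical sketch: enforce positive definiteness by rejection.
@model function rejection_regression(x, y)
    σ ~ truncated(Normal(0, 100), 0, Inf)
    L11 ~ Normal(0, 10) # placeholder priors
    L12 ~ Normal(0, 10)
    L21 ~ Normal(0, 10)
    L22 ~ Normal(0, 10)
    # Sylvester's criterion for the 2x2 case
    if !(L11 > 0 && L11 * L22 - L12 * L21 > 0)
        Turing.@addlogprob! -Inf
        return nothing
    end
    L = [L11 L12; L21 L22]
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I) # isotropic noise with std σ
    end
end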
I am also not sure what effect the prior has on the posterior, i.e. should the prior of L also satisfy the constraints?
Exactly. Whenever you can encode the constraints/symmetries into your prior, do it :) In the particular example you mention, you can for example drop the equality constraint and instead construct your L by using L12 in both the (1, 2) and the (2, 1) entries.
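A minimal sketch of that construction, again with placeholder priors of my own: sample three free entries and put L12 in both off-diagonal slots, so the equality holds by construction, keeping the rejection trick only for the remaining inequality:

using Turing, LinearAlgebra

@model function symmetric_regression(x, y)
    σ ~ truncated(Normal(0, 100), 0, Inf)
    L11 ~ Normal(0, 10) # placeholder priors
    L12 ~ Normal(0, 10)
    L22 ~ Normal(0, 10)
    L = [L11 L12; L12 L22] # L21 == L12 by construction
    # For a symmetric 2x2 matrix, positive definiteness reduces to:
    if !(L11 > 0 && L11 * L22 - L12^2 > 0)
        Turing.@addlogprob! -Inf
        return nothing
    end
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end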
If indeed the example you posted is exactly what you want to do, i.e. you want a prior on a positive-definite matrix, then you always have the LKJ distribution plus sampling the diagonal terms separately, or even a Wishart with parameters ensuring full rank.
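Here's a sketch of the LKJ route (priors are again my own placeholders): draw a 2x2 correlation matrix Ω and positive scales s, and set L = Diagonal(s) * Ω * Diagonal(s), which is symmetric and positive definite by construction, so no rejection step is needed:

using Turing, LinearAlgebra

@model function lkj_regression(x, y)
    σ ~ truncated(Normal(0, 100), 0, Inf)
    s ~ filldist(truncated(Normal(0, 10), 0, Inf), 2) # positive scales
    Ω ~ LKJ(2, 1.0) # uniform prior over 2x2 correlation matrices
    L = Diagonal(s) * Ω * Diagonal(s) # symmetric PD by construction
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end

In recent versions of Distributions/Turing, LKJCholesky can be a more numerically stable alternative to LKJ.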
In case anyone stumbles across this post, this worked for me:
using Turing, LinearAlgebra

@model function fit_onsager(x, y)
    σ ~ truncated(Normal(0, 100), 0, Inf)
    # InverseWishart draws are symmetric and positive definite by construction
    L ~ InverseWishart(2, Matrix{Float64}(I, 2, 2))
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I) # isotropic noise with std σ
    end
end
model = fit_onsager(X, Y)
chain = sample(model, NUTS(0.65), 3_000);
In effect, the inverse Wishart distribution satisfies all my constraints. Note: the sampler sometimes fails to start, complaining about a PosDefException or a SingularException, but if you just restart it a few times it eventually works…
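For anyone wanting a quick sanity check of the model above, here is a hypothetical smoke test with synthetic data (the true L, noise level, and sample size are made up):

using Turing, LinearAlgebra, Random

Random.seed!(1)
L_true = [2.0 0.5; 0.5 1.0] # symmetric, positive definite
X = randn(2, 200)
Y = L_true * X .+ 0.1 .* randn(2, 200) # observations with isotropic noise

model = fit_onsager(X, Y)
chain = sample(model, NUTS(0.65), 3_000)
mean(chain) # posterior means; L[1,2] and L[2,1] should both be near 0.5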