Constrained parameters in Turing

Hi all,

Is it possible to set up a model in Turing so that the posterior distribution of some parameters satisfies a set of constraints?

For example, suppose I want to do linear regression, but from physical principles I know that the coefficient matrix L of my model must be positive definite and its cross-diagonal terms must be equal. E.g.,

Y ~ L * X # linear model of my process

# with variables I observed
Y = [y1, y2] 
X = [x1, x2]
# and parameters I want to infer
L = [L11 L12;
     L21 L22]

# subject to these constraints:
# L must be positive definite
L11 > 0 && L11 * L22 - L12 * L21 > 0 # Sylvester's criterion (leading principal minors positive)
# The cross-diagonal terms of L must be equal
L12 == L21

Is it possible to ensure that the posterior distribution of L obeys these constraints? I am also not sure what effect the prior has on the posterior, i.e. should the prior of L also satisfy the constraints?

Looks like a Cholesky factorization is what you'll need.

Take a look here: DistributionsAD.jl/common.jl at fe2070012167b78c84a49733a0a64997e9533812 · TuringLang/DistributionsAD.jl · GitHub
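In the 2×2 case, that idea might look something like the following. This is a minimal sketch, not the actual DistributionsAD code linked above; cholesky_param, the priors, and the noise term σ are placeholders of mine.

using Turing, LinearAlgebra

# Sketch: parameterize L via its Cholesky factor C, a lower-triangular
# matrix with a positive diagonal. L = C * C' is then symmetric positive
# definite by construction, so both constraints hold automatically.
@model function cholesky_param(x, y)
    c11 ~ truncated(Normal(0, 1), 0, Inf)  # positive diagonal entries
    c22 ~ truncated(Normal(0, 1), 0, Inf)
    c21 ~ Normal(0, 1)                     # unconstrained off-diagonal entry

    C = [c11 0;
         c21 c22]
    L = C * C'

    σ ~ truncated(Normal(0, 1), 0, Inf)
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end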


Is it possible to ensure that the posterior distribution of L obeys these constraints?

Yep; this can be done for arbitrary constraints by just adding -Inf to the logjoint, i.e. putting

if ... # your check for whether the constraints are violated
    Turing.@addlogprob! -Inf
    # Do an early return since this entire sample will
    # be rejected anyways.
    return nothing
end

in your model. But this is a bad idea in a lot of cases, e.g. if you're working with continuous variables and have equality constraints (as you want to do here), since these will be satisfied with probability 0 :confused:
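Fleshed out for the 2×2 case, the pattern might look like this. A hypothetical sketch with placeholder priors of mine; note the check only covers the determinant inequality, since, as said, the equality constraint can't be enforced by rejection:

using Turing, LinearAlgebra

@model function rejection_example(x, y)
    L11 ~ Normal(0, 1)
    L12 ~ Normal(0, 1)
    L21 ~ Normal(0, 1)
    L22 ~ Normal(0, 1)
    σ ~ truncated(Normal(0, 1), 0, Inf)

    # Reject any sample violating the determinant condition by sending
    # the log joint to -Inf and returning early.
    if L11 * L22 - L12 * L21 <= 0
        Turing.@addlogprob! -Inf
        return nothing
    end

    L = [L11 L12;
         L21 L22]
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end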

I am also not sure what effect the prior has on the posterior, i.e. should the prior of L also satisfy the constraints?

Exactly. Whenever you can encode the constraints/symmetries into your prior, do it :) In the particular example you mention, you can for example drop the equality constraint and instead construct your L by using L12 in both the (1, 2) and the (2, 1) entries, as in the sketch below.
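A prior-only sketch of that construction (the model name and priors are mine):

using Turing

# Sketch: the equality constraint holds by construction, since L12
# fills both off-diagonal entries; no rejection step is needed for it.
@model function symmetric_prior()
    L11 ~ Normal(0, 1)
    L22 ~ Normal(0, 1)
    L12 ~ Normal(0, 1)
    L = [L11 L12;
         L12 L22]
    return L
end

Positive definiteness still has to come from somewhere, e.g. from the rejection check above or from one of the matrix priors mentioned below.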

If indeed the example you posted is exactly what you want to do, i.e. you want a prior on a positive-definite matrix, then you always have the LKJ distribution plus sampling the scale (diagonal) terms separately, or even a Wishart with parameters ensuring full rank.
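For instance, a sketch of the LKJ route, assuming a Turing version whose Bijectors supports sampling from LKJ; the scale priors are placeholders of mine:

using Turing, LinearAlgebra

# Sketch: LKJ prior on the correlation structure, with positive scales
# sampled separately. L = D * R * D is symmetric positive definite as
# long as the scales are positive.
@model function lkj_param(x, y)
    R ~ LKJ(2, 1.0)                                   # 2x2 correlation matrix
    s ~ filldist(truncated(Normal(0, 1), 0, Inf), 2)  # positive scales
    L = Diagonal(s) * R * Diagonal(s)

    σ ~ truncated(Normal(0, 1), 0, Inf)
    for i in 1:size(x, 2)
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end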


In case anyone stumbles across this post, this worked for me:

using Turing, LinearAlgebra

@model function fit_onsager(x, y)
    # observation noise scale
    σ ~ truncated(Normal(0, 100), 0, Inf)
    # inverse-Wishart draws are symmetric positive definite,
    # so both constraints hold by construction
    L ~ InverseWishart(2, Matrix{Float64}(I, 2, 2))

    for i in 1:size(x, 2)
        # isotropic observation noise with standard deviation σ
        y[:, i] ~ MvNormal(L * x[:, i], σ^2 * I)
    end
end

model = fit_onsager(X, Y)
chain = sample(model, NUTS(0.65), 3_000);

In effect, the inverse-Wishart distribution satisfies all my constraints :slight_smile: Note: the sampler sometimes fails to start, complaining about a PosDefException or a SingularException, but if you restart it a few times, it eventually works…
