Bivariate normal in Turing

I’m just starting to get to grips with Turing, and enjoying it very much so far. But I’ve run up against a problem with a bivariate normal prior.

using Turing, StatsPlots

@model function pearson_correlation(x)
    rows = size(x, 1)
    # priors
    μ = Vector(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ₁ ~ Uniform(0, 100)
    σ₂ ~ Uniform(0, 100)
    r ~ Uniform(-1, 1)
    # covariance matrix
    Σ = [σ₁ r*σ₁*σ₂;
         r*σ₁*σ₂ σ₁]
    # likelihood. Loop over observations (rows)
    for i = 1:rows
        x[i,:] ~ MvNormal(μ, Σ)
    end
end

x = [0.8 102; 1 98; 0.5 100; 0.9 105; 0.7 103; 0.4 110; 1.2 99; 1.4 87; 0.6 113; 1.1 89; 1.3 93]

chain = sample(pearson_correlation(x), HMC(0.05, 10), 1000)

density(chain[:r])

I’m getting an error relating to x[i,:] ~ MvNormal(μ, Σ)… From the docs I’m not sure whether you can define a covariance matrix like this. Any pointers on how to fix it would be much appreciated.

If I run your code, I get

ERROR: MethodError: no method matching MvNormal(::Array{Any,1}, ::Array{Float64,2})

The fix for that is to initialize μ with a concrete element type. The extra ::Type{T}=Float64 model argument lets Turing substitute whatever element type the sampler needs (e.g. dual numbers during automatic differentiation):

@model function pearson_correlation(x, ::Type{T}=Float64) where T
    rows = size(x, 1)
    # priors
    μ = Vector{T}(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ₁ ~ Uniform(0, 100)
    σ₂ ~ Uniform(0, 100)
    r ~ Uniform(-1, 1)
    # covariance matrix
    Σ = [σ₁ r*σ₁*σ₂;
         r*σ₁*σ₂ σ₁]
    # likelihood. Loop over observations (rows)
    for i = 1:rows
        x[i,:] ~ MvNormal(μ, Σ)
    end
end

But then you still run into issues with your covariance matrix not always being positive definite. I’m not sure what the best way to guarantee this is, but one way around it in your current setup is to check for positive definiteness and set the log density to -Inf when !isposdef(Σ). This seems to work:

using LinearAlgebra

@model function pearson_correlation(x, ::Type{T}=Float64) where T
    rows = size(x, 1)
    # priors
    μ = Vector{T}(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ₁ ~ Uniform(0, 100)
    σ₂ ~ Uniform(0, 100)
    r ~ Uniform(-1, 1)
    # covariance matrix
    Σ = [σ₁ r*σ₁*σ₂;
         r*σ₁*σ₂ σ₁]
    if !isposdef(Σ)
        Turing.@addlogprob! -Inf
        return 
    end
    # likelihood. Loop over observations (rows)
    for i = 1:rows
        x[i,:] ~ MvNormal(μ, Σ)
    end
end

chain = sample(pearson_correlation(x), NUTS(), 1000)

But I guess it would be better to use a different approach to set up your prior for the covariance matrix.
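For the record, one standard alternative is to give the standard deviations and the correlation matrix separate priors, e.g. using the LKJ distribution from Distributions.jl. A rough, untested sketch (the prior choices here are placeholders, not recommendations):

using Turing, LinearAlgebra

@model function pearson_correlation_lkj(x, ::Type{T}=Float64) where T
    rows = size(x, 1)
    # priors
    μ = Vector{T}(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ ~ filldist(truncated(Normal(0, 100), 0, Inf), 2)  # standard deviations
    Ω ~ LKJ(2, 1.0)  # η = 1 is uniform over 2×2 correlation matrices
    # assemble the covariance matrix from scales and correlations
    Σ = Symmetric(Diagonal(σ) * Ω * Diagonal(σ))
    for i = 1:rows
        x[i, :] ~ MvNormal(μ, Σ)
    end
end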

Yep. If anyone knows a more kosher way, it would be great to see it. I’ve looked through the example models, but no luck.

Maybe it would be easier if the observed data were normalised to zero mean and unit variance, but I still feel like I need an example at this stage of learning Turing.
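For the normalisation idea, something like this should do it (Statistics is in the standard library):

using Statistics

z = (x .- mean(x; dims=1)) ./ std(x; dims=1)  # column-wise z-scores

The sample correlation itself is unchanged by this rescaling.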

You can use a Wishart prior distribution. Or, for your particular case, if you replace σ₁*σ₂ in the off-diagonals with sqrt(σ₁*σ₂) and change the second diagonal element from σ₁ to σ₂, then the matrix becomes positive definite, assuming r is never exactly -1 or 1 and σ₁ and σ₂ are never exactly 0. So maybe offset those bounds slightly as well.
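Spelled out, that suggestion looks like this (σ₁ and σ₂ now play the role of variances, since they sit on the diagonal; the exact offsets on the bounds are arbitrary):

σ₁ ~ Uniform(0.01, 100)    # keep the variances strictly positive
σ₂ ~ Uniform(0.01, 100)
r ~ Uniform(-0.99, 0.99)   # keep r strictly inside (-1, 1)
# det(Σ) = σ₁*σ₂*(1 - r^2) > 0 and σ₁ > 0, so Σ is positive definite
Σ = [σ₁            r*sqrt(σ₁*σ₂);
     r*sqrt(σ₁*σ₂) σ₂]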

Turns out I was clumsy: fixing a mistake in the covariance matrix (squared terms on the diagonal, and σ₂ in the second entry) solves the positive-definiteness problem:

using Turing, StatsPlots #, LinearAlgebra

@model function pearson_correlation(x, ::Type{T}=Float64) where T
    rows = size(x, 1)
    # priors
    μ = Vector{T}(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ₁ ~ Uniform(0, 100)
    σ₂ ~ Uniform(0, 100)
    r ~ Uniform(-0.99, 0.99)
    # covariance matrix
    Σ = [σ₁*σ₁ r*σ₁*σ₂;
         r*σ₁*σ₂ σ₂*σ₂]
    # if !isposdef(Σ)
    #     Turing.@addlogprob! -Inf
    #     return 
    # end
    # likelihood
    for i = 1:rows
        x[i,:] ~ MvNormal(μ, Σ)
    end
end

x = [0.8 102; 1 98; 0.5 100; 0.9 105; 0.7 103; 0.4 110; 1.2 99; 1.4 87; 0.6 113; 1.1 89; 1.3 93]

chain = sample(pearson_correlation(x), HMC(0.05, 10), 10000)

plot(layout=(1,2))
scatter!(x[:,1], x[:,2], legend=false, subplot=1, xlabel="Response Time (sec)", ylabel="IQ", title="Data space")
histogram!(chain[:r], subplot=2, bins=[-1:0.02:1;], normalize=:pdf, alpha=0.5, xlabel="Correlation", ylabel="Posterior density", title="Parameter space", legend=false)

Overall it looks good.

EDIT: There were some warmup issues with HMC, but it works great with NUTS.
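That is, swapping the sampler call for

chain = sample(pearson_correlation(x), NUTS(), 1000)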

Maybe just use Symmetric() to deal with the not-positive-definite issue:

using Turing, StatsPlots, LinearAlgebra

@model function pearson_correlation(x, ::Type{T} = Float64) where {T}
    rows = size(x, 1)
    # priors
    μ = Vector{T}(undef, 2)
    μ[1] ~ Normal(0, 100)
    μ[2] ~ Normal(0, 100)
    σ₁ ~ Uniform(0, 100)
    σ₂ ~ Uniform(0, 100)
    r ~ Uniform(-0.99, 0.99)
    # covariance matrix, wrapped in Symmetric so it is exactly symmetric
    Σ = Symmetric([
        σ₁*σ₁ r*σ₁*σ₂
        r*σ₁*σ₂ σ₂*σ₂
    ])
    for i = 1:rows
        x[i, :] ~ MvNormal(μ, Σ)
    end
end
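Why that can help: isposdef, and the Cholesky factorization MvNormal performs internally, first check that the matrix is exactly symmetric, and floating-point round-off can break that even when the matrix is positive definite in exact arithmetic. A tiny illustration:

using LinearAlgebra

A = [1.0  0.5 + 1e-10;
     0.5  1.0]
isposdef(A)             # false: A is not exactly symmetric
isposdef(Symmetric(A))  # true: only the upper triangle is used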