Copulas.jl + Turing.jl - numerical stability?

Hi all,

I am a big fan of both libraries, but when I try to include copulas in probabilistic (Turing) models, I often get errors related to quantities being out of bounds. I have included a simple example below.

The below Turing model runs fine :slight_smile:

using Distributions, Random, Copulas

μ₁_prior = Normal(3.0, 1/2)
σ₁_prior = Normal(0, 1/2) |> dist -> truncated(dist, lower = 0) 

μ₂_prior = Normal(1.15, 1/3)
σ₂_prior = Normal(0, 1/3) |> dist -> truncated(dist, lower = 0) 

ρ_prior = Normal(-1/3, 1/4) |> dist -> truncated(dist, lower = -1, upper = 1)

@model function multi_var_meas_error(; x_meas, ϵ::Float64)
    # starting points (priors)
    ρ ~ ρ_prior
    μ₁ ~ μ₁_prior; σ₁ ~ σ₁_prior
    μ₂ ~ μ₂_prior; σ₂ ~ σ₂_prior
    
    # Likelihood
    # Defining marginal distributions and copula
    Σ = [
        σ₁^2            ρ * σ₁ * σ₂  
        ρ * σ₁ * σ₂     σ₂^2
    ]
    
    # For numerical stability, and a positive-definite covariance matrix:
    Σ = [i == j ? Σ[i, j] + eps() : Σ[i, j] for i ∈ 1:2, j ∈ 1:2]
    
    # The input x_meas includes the measurement error ϵ
    # Estimate the true levels without the measurement error
    x₁ ~ Normal(x_meas[1], ϵ) |> d -> truncated(d, lower = 0)
    x₂ ~ Normal(x_meas[2], ϵ) |> d -> truncated(d, lower = 0)
    
    Turing.@addlogprob! logpdf(
        MvNormal([μ₁, μ₂], Σ), 
        [x₁, x₂])

    # Turing.@addlogprob! logpdf(
    #     SklarDist(
    #         GaussianCopula(Σ),
    #         (Normal(μ₁, σ₁) |> d -> truncated(d, lower = 0), 
    #         Normal(μ₂, σ₂) |> d -> truncated(d, lower = 0))
    #     ),    
    #     [x₁, x₂])
end

n_draws = 1_000; n_warmup = 10_000; n_chains = 4; target_acc = 0.65;
mcmc_sampler = Turing.NUTS(n_warmup, target_acc)

posterior = multi_var_meas_error(x_meas = [2.5, 1.0], ϵ = 0.2) |>
    x -> sample(x, mcmc_sampler, MCMCThreads(),
                n_draws, n_chains, seed = 231123)

…however, when I comment out the the MvNormal, and uncomment the Sklar distribution with the Gaussian copula (using the same covariance matrix)…

...

    # Turing.@addlogprob! logpdf(
    #     MvNormal([μ₁, μ₂], Σ), 
    #     [x₁, x₂])

    Turing.@addlogprob! logpdf(
        SklarDist(
            GaussianCopula(Σ),
            (Normal(μ₁, σ₁) |> d -> truncated(d, lower = 0), 
            Normal(μ₂, σ₂) |> d -> truncated(d, lower = 0))
        ),    
        [x₁, x₂])
end

…then I get the following error:

ERROR: TaskFailedException

nested task error: TaskFailedException

    nested task error: DomainError with 1.0000000000000058:
    `abs(x)` cannot be greater than 1.

Yes, this is painfull.

The fundamental issues is that Gaussian copulas have no closed form expressions – we need to pass through inversion of the Gaussian cdf back and forth, which is a numerically instable computations.

The less-fundamental issue is that currently cdf(Normal(),x) from Distributions.jl can end up out of bound with negative or > 1 values…

Could you open an issue on Copulas.jl ? Maybe we should simply enforce these bounds when doing inversion.

Btw i endeed up on this version of your code:

@model function multi_var_meas_error(; x_meas, ϵ::Float64)
    μ₁ ~ Normal(3.0, 1/2)
    μ₂ ~ Normal(1.15, 1/3)
    σ₁ ~ TruncatedNormal(0, 1/2, 0, Inf)
    σ₂ ~ TruncatedNormal(0, 1/3, 0, Inf) 
    ρ  ~ TruncatedNormal(-1/3, 1/4, -1, 1)

    Model = SklarDist(GaussianCopula([1 ρ; ρ 1]),(
        TruncatedNormal(μ₁, σ₁^2+ϵ, 0.0, Inf),
        TruncatedNormal(μ₂, σ₂^2+ϵ, 0.0, Inf),
    ))
    Turing.@addlogprob! loglikelihood(Model, x_meas)
end
1 Like

Tried to reproduce the error in the OP, but having a different error when precompiling Copulas:

Copulas [ae264745-0b69-425e-9d9d-cf662c5eec93]

Failed to precompile Copulas [ae264745-0b69-425e-9d9d-cf662c5eec93] to "/home/dan/.julia/compiled/v1.10/Copulas/jl_MaQbtw".
ERROR: LoadError: MethodError: no method matching isless(::TaylorSeries.Taylor1{Float64}, ::TaylorSeries.Taylor1{Float64})

Closest candidates are:
  isless(::Missing, ::Any)
   @ Base missing.jl:87
  isless(::Any, ::Missing)
   @ Base missing.jl:88

Stacktrace:
  [1] max(x::TaylorSeries.Taylor1{Float64}, y::TaylorSeries.Taylor1{Float64})
    @ Base ./operators.jl:476
  [2] ϕ(G::Copulas.ClaytonGenerator{Float64}, t::TaylorSeries.Taylor1{Float64})
    @ Copulas ~/.julia/packages/Copulas/3AzcN/src/Generator/UnivariateGenerator/ClaytonGenerator.jl:43

My version information is:

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × 12th Gen Intel(R) Core(TM) i5-1240P
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, alderlake)

Have you bumped into this error?

No not yet, this one is really weird. Have you tried updating your packages ? or more directly what are versions of Copulas and TaylorSeries ?

1 Like

I was at 0.13.2… updating to 0.16.0 now. Maybe this will solve the problem…

UPDATE: That fixed it. Thanks

2 Likes

So the issue really was that Distributions.jl’s cdf(Normal(),x) can be outside of the unit range… :rage: and your example did trigger that.

I fixed it on Copulas.jl#lrnv/issue196 branch by a violent clamp() call :anger: , let us check that it works and then I’ll merge & push onto the registry.

Summary
using Distributions, Random, Copulas, Turing

@model function multi_var_meas_error(; x_meas, ϵ::Float64)
    # starting points (priors)
    μ₁ ~ Normal(3.0, 1/2)
    μ₂ ~ Normal(1.15, 1/3)
    σ₁ ~ TruncatedNormal(0, 1/2, 0, Inf)
    σ₂ ~ TruncatedNormal(0, 1/3, 0, Inf) 
    ρ  ~ TruncatedNormal(-1/3, 1/4, -1, 1)
    
    # Likelihood
    # Defining marginal distributions and copula
    Σ = [
        σ₁^2            ρ * σ₁ * σ₂  
        ρ * σ₁ * σ₂     σ₂^2
    ]
    
    # For numerical stability, and a positive-definite covariance matrix:
    Σ = [i == j ? Σ[i, j] + eps() : Σ[i, j] for i ∈ 1:2, j ∈ 1:2]
    
    # The input x_meas includes the measurement error ϵ
    # Estimate the true levels without the measurement error
    x₁ ~ TruncatedNormal(x_meas[1], ϵ, 0, Inf)
    x₂ ~ TruncatedNormal(x_meas[2], ϵ, 0, Inf)
    
    # Turing.@addlogprob! logpdf(
    #     MvNormal([μ₁, μ₂], Σ), 
    #     [x₁, x₂])

    Turing.@addlogprob! logpdf(
        SklarDist(
            GaussianCopula(Σ),
            (TruncatedNormal(x_meas[1], ϵ, 0, Inf), 
            TruncatedNormal(x_meas[2], ϵ, 0, Inf))
        ),    
        [x₁, x₂])
end

n_draws = 1_000; n_warmup = 10_000; n_chains = 4; target_acc = 0.65;
mcmc_sampler = Turing.NUTS(n_warmup, target_acc)

posterior = multi_var_meas_error(x_meas = [2.5, 1.0], ϵ = 0.2) |>
    x -> sample(x, mcmc_sampler, MCMCThreads(),
                n_draws, n_chains, seed = 231123)
1 Like

Haven’t read all this in detail, but if these values are actually something you want to assign zero probability, you can just do

if value_is_not_okay
    Turing.@addlogprob! -Inf  # leads to automatic rejection
    return nothing # just return early because we'll reject this anyways
end
1 Like