# CDF of a copula from Copulas.jl?

I am experimenting with `Copulas.jl` to see if I could use it for my work. I am in general impressed. However, I found I could not get the CDF value from a joint distribution constructed by the copula. Getting the PDF value is fine, though. The following is a MWE.

``````using Copulas, Distributions

x = Normal(0, 1)
y = Normal(0, 2)
C = GaussianCopula([1 0.5; 0.5 1])
D = SklarDist(C, (x,y))

pdf(D, ([1.0, 1.0])) # this is fine
cdf(D, ([1.0, 1.0])) # this produces error
``````

The last line produces the method error: `"ERROR: MethodError: no method matching cdf(::GaussianCopula{2, Matrix{Float64}}, ::Vector{Float64})"`.

Any suggestion would be appreciated.

1 Like

`cdf` is more complicated function, if you have just one-variate distribution you can use QuadGK.jl, for multivariate - HCubature.jl or Quasi-Monte-Carlo approaches like in MvNormalCDF.jl

Indeed, CDF values can be computed using numerical integrations. The reason I brought it up is because the package’s document indicates that the value could be obtained using the usual `cdf()` function, and that (as I just found out) it actually returns cdf values for Archimedean copulas though not for the more-often-used Gaussian copulas (which is what in my original question).

This is stated in the package’s doc: “This package brings most standard copula features into native Julia: random number generation, pdf and cdf, fitting, copula-based multivariate distributions through Sklar’s theorem, etc., while fully complying with the Distributions.jl API…”

The following code shows that it does return cdf values for Archimedean copulas.

``````julia> using Copulas, Distributions

julia> x = Normal(0,1); y = Normal(0,2);

julia> C = GumbelCopula(2, 1.2)  # a type of Archimedean copula
GumbelCopula{2, Float64}(θ=1.2)

julia> D = SklarDist(C, (x,y))
SklarDist{GumbelCopula{2, Float64}, Tuple{Normal{Float64}, Normal{Float64}}}(
C: GumbelCopula{2, Float64}(θ=1.2)
m: (Normal{Float64}(μ=0.0, σ=1.0), Normal{Float64}(μ=0.0, σ=2.0))
)

julia> cdf(D, ([1.0, 1.0]))
0.6132230493600664
``````

So, the `cdf()` works for Archimedean copulas but not for Gaussian copulas. Maybe there is a feature not been implemented yet? Or a bug? @lrnv

2 Likes

Indeed, this is simply not implemented yet, the package is not complete at all. Should be a quick fix though, could you open an issue on the repo about it, I’ll take a look as soon as I can ?

Thanks for catching it !

1 Like

I fixed `Copulas.jl` (in the latest master commit), so that it provides a more accurate error: we rely on `Distributions.jl`’s implementation of the multivariate Gaussian and Student distributions to handle the complexity behind the Gaussian and Student copulas, and `Distributions.jl` has no multivariate Gaussian `cdf`.

Here is a Github issue for this : CDF of Multivariate Normal Distribution · Issue #260 · JuliaStats/Distributions.jl · GitHub

Also, seems relevant there : MVN cdf by blackeneth · Pull Request #114 · JuliaStats/StatsFuns.jl · GitHub

But it does not look like it is going to be fixed soon. If you find somewhere an integration routine that computes the `cdf` of a multivariate Gaussian, I am willing to include it as a temporary workaround though, until those issues are fixed in `StatsFun.jl` and `Distributions.jl`

I oppened an issue myself on Copulas.jl there : Elliptical CDF · Issue #15 · lrnv/Copulas.jl · GitHub to cross-link everything.

1 Like

Thanks for looking into the issue! I saw that you already open an issue in the repo.

It’s somewhat surprising to know that the cdf of multivariate Gaussians has not been implemented in `Distributions.jl` and thus cannot be (more or less) easily applied by `Copulas.jl`. On the other hand, as @PharmCat mentioned in the 2nd post above, the cdf can now be calculated using packages like `MvNormalCDF.jl`. Could `Copulas.jl` use this or other packages to fix the problem?

1 Like

Hi!
Unlikely that MvNormal `cdf` will be implemented in Distributions.jl or StatsFuns.jl , this needs multidimensional integration solution. @blackeneth provide Quasi-Monte-Carlo solution as it done in MATLAB and I make package (MvNormalCDF.jl) and include some improvements. Maybe this is most robust solution now.

PS if some things needs to be implemented in MvNormalCDF - just open issue, I will try to do that

1 Like

Yes we should try with MvNormalCdf.jl. you can make a PR if you want or wait until tomorrow, I’ll do it.

1 Like

Thanks to @PharmCat’s volunteer. I have opened an issue.

If I understand correctly - anyway there is no analytical solution for this type, so if `cdf` will be implemented - it will be numerical method. Quasi-Monte-Carlo as I understand have sense in high-dimentional cases in your case any numerical method can be used.

If I understand correctly (point me if I’m wrong) - this code could be used to get `cdf`:

``````using Copulas, Distributions, BenchmarkTools
import Cubature, HCubature

x = Normal(0, 1)
y = Normal(0, 2)
C = GaussianCopula([1 0.5; 0.5 1])
D = SklarDist(C, (x,y))

pdf(D, [1.0, 1.0]) # this is fine
#cdf(D, ([1.0, 1.0]))

func(x) = pdf(D, [x, x])

HCubature.hcubature(func, [-8.0, -8.0], [8.0, 8.0], rtol=sqrt(eps()))
#(0.9999366573436514, 1.4884341341974467e-8)
Cubature.hcubature(func, [-8.0, -8.0], [8.0, 8.0], reltol=sqrt(eps()))
#(0.9999366573436526, 1.4884341344610422e-8)
Cubature.pcubature(func, [-8.0, -8.0], [8.0, 8.0], reltol=sqrt(eps()))
#(0.9999366575163308, 1.2989609388114332e-14)

@btime HCubature.hcubature(func, [-8.0, -8.0], [8.0, 8.0], rtol=sqrt(eps()))
#95.819 ms (791157 allocations: 36.17 MiB)

@btime Cubature.hcubature(func, [-8.0, -8.0], [8.0, 8.0], reltol=sqrt(eps()))
#83.939 ms (688260 allocations: 32.91 MiB)

@btime Cubature.pcubature(func, [-8.0, -8.0], [8.0, 8.0], reltol=sqrt(eps()))
#16.128 ms (125790 allocations: 6.01 MiB)
``````
1 Like

Thanks; that’s helpful. On the other hand, I was thinking of making `cdf()` workable to all elliptical copulas in `Copulas.jl`, as it currently does for the Archimedean copulas. So that, for example, `cdf(D, ([0.1, 0.2]))` would return the CDF value where `D` is constructed by `GaussianCopula`. It probably means that the `HCubature` or `MvNormalCDF` may have to be called by `Copulas`.

In terms of `HCubature` (used before) vs `MvNormalCDF` (never used): My experience with an old problem was that `HCubature` is quite good (fast and accurate) if the dimension is like <=4. For higher dimensions, the convergence rate of the cubature method deteriorates quickly and the estimation becomes very slow, leaving the quasi-Monte Carlo method the only choice (I coded myself using low discrepancy sequences such as the Halton). So, if `Copulas.jl` is to implement `cdf()` for elliptical copulas, the quasi-Monte Carlo method may be the choice for its generality, unless the package decides to support the `cdf()` up to a certain dimension.

Just to clarify: my original intent of this thread is not too much about a one-time calculation of the CDF, but more about using `pdf()` and `cdf()` in a consistent way in `Copulas.jl`. Sorry that it was not made clear in the original post.

(edited to add the last paragraph)

1 Like

If I understand correctly `cdf` win MvNormalCDF can be calculated:

``````using MvNormalCDF
using Copulas, Distributions

x = Normal(0, 1)
y = Normal(0, 2)
C = GaussianCopula([1 0.5; 0.5 1])
D = SklarDist(C, (x,y))

####
μ = mean.(D.m)
σ = collect(std.(D.m))
Σ = cor2cov(D.C.Σ, σ)
MvNormalCDF.mvnormcdf(μ, Σ, [-1.0, -2.0], [3.0, 4.0], m = 6000)
#(0.7216874201273508, 4.10731083253638e-5)
####
func(x) = pdf(D, [x, x])
Cubature.pcubature(func, [-1.0, -2.0], [3.0, 4.0], reltol=sqrt(eps()))
#(0.7217122929674319, 8.450906641144229e-12)
``````

But I think this solution can’t be applied to other distributions. `MvNormalCDF ` based on MATLAB qsimvnv() function by Alan Genz with using Richtmyer generators and use propeties of MvNormal distribution (see ref). Other multivariate types should implement general Quasi-Monte-Carlo integration tecniques. So I didn’t found packages for QMC multidimentional integration, may be someone wants to make it one day.

2 Likes

What about computing the LU decomposition of the covariance matrix and using the univariate gaussian cdf ? I think this might be faster, and I do not really want to depend on Matlab to stay type agnostic. I think it might even work for any elliptical.

I’ll give it a shot asap.

1 Like

I don’t know … When I said that `MvNormalCDF ` based on MATLAB - I mean than code was rewritten from MATLAB utilitie - ofcource `MvNormalCDF ` is pure Julia package with minimal dependencies.

I found package MonteCarloIntegration.jl , I didn’t found QuasiMonteCarlo integration, but this QuasiMonteCarlo can helps. Maybe I can make mini-mackage for QuasiMonteCarloIntegratio if it may bu usefull.

1 Like

It should now be fixed on 0.1.5, currently registering.

@HJW019, I think you have now the behavior you wanted for all elliptical copulas, which is IMHO the correct behavior of the package. It might be slow for some cases (especially high dimensional…), but should be correct. It uses `Cubatures.jl`, but if you want to propose a version with QMC you can of course.

For Gaussian copulas, I used the fact that uncorrelated Gaussian random vectors and independent, which is maybe not the most performing algorithm, but its clean, and does “only” require a linear solve and a matrix square root.

Thanks @PharmCat and @HJW019 for your input, do not hesitate to tell me if it works as expected !

1 Like

Wonderful! Thanks for the contribution. Now I think the package has all the basic functions for a typical copula analysis. Thanks again.

You are welcome, please come back if something is wrong !

++

I did a quick check on the updated `cdf()` and was puzzled by the finding. The result of `cdf()` on the Gaussian copula is inconsistent with results from three alternative ways of computing the CDF. Meanwhile, the three alternatives are consistent among themselves.

``````using Copulas, BivariateCopulas, Distributions, StatsBase, MvNormalCDF, Cubature

rho = 0.5
ub1, ub2 = -0.1, 0.1
x = Normal(0, 1)
y = Normal(0, 2)

#### Copulas.jl

C1 = Copulas.GaussianCopula([1 rho; rho 1])
D1 = SklarDist(C1, (x,y))
cdf(D1, [ub1, ub2]) # 0.24018536633697982

#### alternative 1: Cubature on D1

func(x) = pdf(D1, [x, x])
Cubature.pcubature(func, [-15, -15], [ub1, ub2], reltol=sqrt(eps())) # (0.32190029773358525, 3.3549441003088987e-11)

#### alternative 2: MvNormalCDF on D1

μ = mean.(D1.m)
σ = collect(std.(D1.m))
Σ = cor2cov(D1.C.Σ, σ)
MvNormalCDF.mvnormcdf(μ, Σ, [-Inf, -Inf], [ub1, ub2], m = 10000) # (0.3219097660568473, 1.5325442373789225e-5)

#### alternative 3: BivariateCopulas.jl

C2 = BivariateCopulas.Gaussian(rho)
D2 = C2(x, y);
cdf(D2, ub1, ub2) # 0.3219002977336174
``````

As you can see, the first result of `0.24018536633697982` is quite different from others.

I looked at the source code:

``````function Distributions.cdf(C::CT,u) where {CT<:Copula}
f(x) = pdf(C,x)
z = zeros(eltype(u),length(C))
return Cubature.pcubature(f,z,u,reltol=sqrt(eps()))
end
``````

The line on `z` seems to indicate that the lower bound of the integration is a vector of 0. However, I think it should be the lower bounds of the marginals of `X` and `Y`. In the above example where both of the marginals are normal, the domain should be in the range of (-Inf, u).

How to fix the program? I am scratching my head, as I don’t know how to pull out the lower bounds of user-specified marginals.

Indeed, there is a bug. This is troubling as the lower bound of the copula pdf should be zero and not `-Inf`… The copula pdf is not integrating on the marginal scales.

I’ll make a test out of your example and try to fix it. Thanks again 