Cdf of multivariate normal in distributions.jl

I’m trying to use the cdf of a multivariate normal distribution using Distributions.jl.
While

d = Normal(0,1)
d2 = MvNormal([0.;0.], [1. 0.;0. 1.])
x = [1. ; 2.]
e1 = cdf(d,x[1])
e2 = cdf(d,x[2])

works (as long as the mean vector and the var-cov matrix for the multivariate normal are Floats ), I get an error whenever I write
e3 = cdf(d2,x)
Here is the error message:

ERROR: MethodError: no method matching cdf(::Distributions.MvNormal{Float64,PDMats.PDMat{Float64,Array{Float64,2}},Array{Float64,1}}, ::Array{Float64,1})
Closest candidates are:
  cdf(::Distributions.Distribution{Distributions.Univariate,S<:Distributions.ValueSupport}, ::AbstractArray{T,N}) at /Applications/JuliaPro-0.5.1.1.app/Contents/Resources/pkgs-0.5.1.1/v0.5/Distributions/src/univariates.jl:205

Does the cdf function support multivariate distributions? I couldn’t find anything on this in the documentation of Distributions.jl.

No, there is not a cdf for the multivariate normal. Generally, it is a slightly complicated computation. We have some code evaluating the bi- and trivariate case but it hasn’t been used for a long time. For some time, I’ve wanted a dedicated package for multivariate distributions that would be using StaticArrays for storage. The cdf code for fit well in such a package.

That’s a shame. We really should have something in Julia for this. I’ll work around it using PyCall for the time being.

Are you mainly interested in 2D and 3D or higher dimensions? Do you know how higher dimensions are handled in Python?

My interest is currently for 2D. I have no idea how higher dimensions are handled in Python right now but hopefully I’ll learn a little bit about it in the next few days.

If you are interested in spending a little time on it, we could probably get the 2D Julia version working again. If we do it with StaticArrays it will also be extremely fast.

Hello,
I am also interested in a julia version for the cdf for 2D normal. I might be able to spend a bit time on this, but where can I find the version that exists already? I am not familiar with StaticArrays though, but I can certainly give it a try.
Thanks!

It is here https://github.com/JuliaStats/StatsFuns.jl/blob/e21bc26b1773aeb86bc8df4832db9c371d09ba2b/src/tvpack.jl#L338

I think it would be very helpful to have a CDF defined for MvNormal, analog to MATLAB’s mvncdf. It would definitely help when dealing with truncated Gaussian priors, etc.

I’m working on it (shh…)

I was surprised to find that there is a whole book about this:

I have the bivariate case in https://github.com/mschauer/GaussianDistributions.jl:

https://github.com/mschauer/GaussianDistributions.jl/blob/master/src/bivariate.jl

Ah, I did not know we had that!

Yes, Prof. Genz literally “wrote the book on it”!

I apologize if this is a necro-bump, but came across this thread as the first Google Search entry.

Has the CDF for Multivariate normals been implemented in Julia now? I essentially looking for an analogous to just sample the CDF of a normal distribution, maybe something akin to R’s pnorm.

Edit: Found what I wanted with the combination of Univariate Distributions · Distributions.jl and Univariate Distributions · Distributions.jl

E.g.:

julia> dist = Distributions.Normal(0, 1)
Distributions.Normal{Float64}(μ=0.0, σ=1.0)

julia> Distributions.cdf(dist, 3/640) - Distributions.cdf(dist, -1/128)
0.00498673995194876

MvNormalCDF.jl at GitHub - PharmCat/MvNormalCDF.jl: Quasi-Monte-Carlo numerical computation of multivariate normal probabilities

Thanks for the link!

Looks like here is the open issue: CDF of Multivariate Normal Distribution · Issue #260 · JuliaStats/Distributions.jl · GitHub

I used HCubature to calculate integral of PDF. Maybe not the fastest, but it works

using HCubature, Distributions, Plots, LinearAlgebra
plotly()
μ = [0.0, 0.0]
Σ = [1.0 0.0; 0.0 1.0]

my_pdf(x) = ((2*pi)^2*det(Σ))^(-0.5)*exp(-0.5*transpose(x-μ)*inv(Σ)*(x-μ))
mv_norm_cdf(x,y)= hcubature(x -> my_pdf(x), [-10.0, -10.0], [x, y])[1]

xs = -5:0.5:5
ys = -5:0.5:5

surface(xs,ys, mv_norm_cdf, xlabel="X", ylabel="Y", zlabel="CDF", color=:viridis, zlim=[0,1])


I optimalized my code so it works faster. I used 99.7 rule to define the interval of calculation.


using HCubature, Distributions, Plots, BenchmarkTools,Statistics, StatsBase,Random


function mv_norm_cdf(x,y , μ, Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    dist = MvNormal(μ,Σ)
    return hcubature(x -> pdf(dist,x), [x_low, y_low], [x, y])[1]
end

function mv_norm_cdf_plot(μ, Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    x_high = μ[1]+3*Σ[1,1]
    y_high = μ[2]+3*Σ[2,2]
    dist = MvNormal(μ,Σ)
    _mv_norm_cdf(x,y) = hcubature(x -> pdf(dist,x), [x_low, y_low], [x, y])[1]

    xs = x_low:0.5:x_high
    ys = y_low:0.5:y_high

    surface(xs,ys, _mv_norm_cdf, xlabel="X", ylabel="Y", zlabel="CDF", color=:viridis, zlim=[0,1])
end

Here is example use

μ = [0.0, 0.0]
Σ = [1.0 0.0; 0.0 1.0]

mv_norm_cdf(0,0,μ,Σ)
mv_norm_cdf_plot(μ,Σ)

image

Also to compare example of empirical cdf

function mv_normal_ecdf(μ,Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    x_high = μ[1]+3*Σ[1,1]
    y_high = μ[2]+3*Σ[2,2]
    dist = MultivariateNormal(μ, Σ)
    n = 1000
    sample = rand(dist, n)
    
    ecdf(x,y) = sum((sample[1,i]<x && sample[2,i]<y) for i in 1:n)/n
    
    xs = x_low:0.25:x_high
    ys = y_low:0.25:y_high
    
    surface(xs,ys,ecdf, xlabel="X", ylabel="Y", zlabel="ECDF", color=:viridis, zlim=[0,1])
end

image

Can you add a plot of the difference also?