Cdf of multivariate normal in distributions.jl

I’m trying to use the cdf of a multivariate normal distribution using Distributions.jl.
While

d = Normal(0,1)
d2 = MvNormal([0.;0.], [1. 0.;0. 1.])
x = [1. ; 2.]
e1 = cdf(d,x[1])
e2 = cdf(d,x[2])

works (as long as the mean vector and the var-cov matrix for the multivariate normal are Floats ), I get an error whenever I write
e3 = cdf(d2,x)
Here is the error message:

ERROR: MethodError: no method matching cdf(::Distributions.MvNormal{Float64,PDMats.PDMat{Float64,Array{Float64,2}},Array{Float64,1}}, ::Array{Float64,1})
Closest candidates are:
  cdf(::Distributions.Distribution{Distributions.Univariate,S<:Distributions.ValueSupport}, ::AbstractArray{T,N}) at /Applications/JuliaPro-0.5.1.1.app/Contents/Resources/pkgs-0.5.1.1/v0.5/Distributions/src/univariates.jl:205

Does the cdf function support multivariate distributions? I couldn’t find anything on this in the documentation of Distributions.jl.

1 Like

No, there is not a cdf for the multivariate normal. Generally, it is a slightly complicated computation. We have some code evaluating the bi- and trivariate case but it hasn’t been used for a long time. For some time, I’ve wanted a dedicated package for multivariate distributions that would be using StaticArrays for storage. The cdf code for fit well in such a package.

3 Likes

That’s a shame. We really should have something in Julia for this. I’ll work around it using PyCall for the time being.

1 Like

Are you mainly interested in 2D and 3D or higher dimensions? Do you know how higher dimensions are handled in Python?

My interest is currently for 2D. I have no idea how higher dimensions are handled in Python right now but hopefully I’ll learn a little bit about it in the next few days.

If you are interested in spending a little time on it, we could probably get the 2D Julia version working again. If we do it with StaticArrays it will also be extremely fast.

1 Like

Hello,
I am also interested in a julia version for the cdf for 2D normal. I might be able to spend a bit time on this, but where can I find the version that exists already? I am not familiar with StaticArrays though, but I can certainly give it a try.
Thanks!

It is here https://github.com/JuliaStats/StatsFuns.jl/blob/e21bc26b1773aeb86bc8df4832db9c371d09ba2b/src/tvpack.jl#L338

1 Like

I think it would be very helpful to have a CDF defined for MvNormal, analog to MATLAB’s mvncdf. It would definitely help when dealing with truncated Gaussian priors, etc.

I’m working on it (shh…)

2 Likes

I was surprised to find that there is a whole book about this:

1 Like

I have the bivariate case in https://github.com/mschauer/GaussianDistributions.jl:

https://github.com/mschauer/GaussianDistributions.jl/blob/master/src/bivariate.jl

2 Likes

Ah, I did not know we had that!

Yes, Prof. Genz literally “wrote the book on it”!

I apologize if this is a necro-bump, but came across this thread as the first Google Search entry.

Has the CDF for Multivariate normals been implemented in Julia now? I essentially looking for an analogous to just sample the CDF of a normal distribution, maybe something akin to R’s pnorm.

Edit: Found what I wanted with the combination of Univariate Distributions · Distributions.jl and Univariate Distributions · Distributions.jl

E.g.:

julia> dist = Distributions.Normal(0, 1)
Distributions.Normal{Float64}(μ=0.0, σ=1.0)

julia> Distributions.cdf(dist, 3/640) - Distributions.cdf(dist, -1/128)
0.00498673995194876

MvNormalCDF.jl at GitHub - PharmCat/MvNormalCDF.jl: Quasi-Monte-Carlo numerical computation of multivariate normal probabilities

1 Like

Thanks for the link!

Looks like here is the open issue: CDF of Multivariate Normal Distribution · Issue #260 · JuliaStats/Distributions.jl · GitHub

1 Like

I used HCubature to calculate integral of PDF. Maybe not the fastest, but it works

using HCubature, Distributions, Plots, LinearAlgebra
plotly()
μ = [0.0, 0.0]
Σ = [1.0 0.0; 0.0 1.0]

my_pdf(x) = ((2*pi)^2*det(Σ))^(-0.5)*exp(-0.5*transpose(x-μ)*inv(Σ)*(x-μ))
mv_norm_cdf(x,y)= hcubature(x -> my_pdf(x), [-10.0, -10.0], [x, y])[1]

xs = -5:0.5:5
ys = -5:0.5:5

surface(xs,ys, mv_norm_cdf, xlabel="X", ylabel="Y", zlabel="CDF", color=:viridis, zlim=[0,1])


1 Like

I optimalized my code so it works faster. I used 99.7 rule to define the interval of calculation.


using HCubature, Distributions, Plots, BenchmarkTools,Statistics, StatsBase,Random


function mv_norm_cdf(x,y , μ, Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    dist = MvNormal(μ,Σ)
    return hcubature(x -> pdf(dist,x), [x_low, y_low], [x, y])[1]
end

function mv_norm_cdf_plot(μ, Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    x_high = μ[1]+3*Σ[1,1]
    y_high = μ[2]+3*Σ[2,2]
    dist = MvNormal(μ,Σ)
    _mv_norm_cdf(x,y) = hcubature(x -> pdf(dist,x), [x_low, y_low], [x, y])[1]

    xs = x_low:0.5:x_high
    ys = y_low:0.5:y_high

    surface(xs,ys, _mv_norm_cdf, xlabel="X", ylabel="Y", zlabel="CDF", color=:viridis, zlim=[0,1])
end

Here is example use

μ = [0.0, 0.0]
Σ = [1.0 0.0; 0.0 1.0]

mv_norm_cdf(0,0,μ,Σ)
mv_norm_cdf_plot(μ,Σ)

image

Also to compare example of empirical cdf

function mv_normal_ecdf(μ,Σ)
    x_low = μ[1]-3*Σ[1,1]
    y_low = μ[2]-3*Σ[2,2]
    x_high = μ[1]+3*Σ[1,1]
    y_high = μ[2]+3*Σ[2,2]
    dist = MultivariateNormal(μ, Σ)
    n = 1000
    sample = rand(dist, n)
    
    ecdf(x,y) = sum((sample[1,i]<x && sample[2,i]<y) for i in 1:n)/n
    
    xs = x_low:0.25:x_high
    ys = y_low:0.25:y_high
    
    surface(xs,ys,ecdf, xlabel="X", ylabel="Y", zlabel="ECDF", color=:viridis, zlim=[0,1])
end

image

1 Like

Can you add a plot of the difference also?