Multivariate Normal with Positive Semi-Definite Covariance Matrix

question

#1

Hi All,

I’m using julia to sample from a gaussian process with squared exponential kernel.

using Distributions, Distances

x_star = -5:0.1:5
K = exp(-0.5 * pairwise(SqEuclidean(), x_star'))
f_star = rand(MvNormal(K), 1)

fails with a Base.LinAlg.PosDefException. In this example, the points are close together and K becomes singular i.e. det(K) # 0.0.

The equivalent calculation in R:

x.star <- seq(-5, 5, by=0.1)
K = exp(-0.5 * rdist::pdist(x.star) ^ 2)
f = mvtnorm::rmvnorm(1, sigma=K)

works fine, because the mvtnorm package only requires Positive Semi-definite matrices for the covariance (due to the multiple options for matrix decomposition I guess? The “cholesky” method fails, but “svd” and “eigen” both work fine).

This leads me to my question: is it possible to sample from a multivariate normal distribution in Julia with a PSD matrix? Is there some argument/coercion that I’m missing?

By the way, you can “hack” your way around this error by adding a small constant to the diagonal elements of K, e.g.

x_star = -5:0.1:5
K_psd = exp(-0.5 * pairwise(SqEuclidean(), x_star'))
K_pd = copy(K_psd);
for i in 1:length(x_star)
    K_pd[i, i] = K_pd[i, i] + 0.001
end
det(K_pd) # 2.00e-256
det(K_psd) # 0.00
f_star = rand(MvNormal(K_pd), 1)

but it would be nice to not have to do this.


Cholesky factorization for slightly non-Hermitian matrices
#2

I don’t think this is in the Distributions package yet.

I would use the LDLt factorization (see the docs for ldltfact()). If K = LDL.’, then I’d take an iid normal vector and left-multiply by LD^{1/2}.


#3

Looking at the source code, it seems that rmvnorm uses eigenvalues (not Cholesky) by default, and it accepts eigenvalues as “non-negative” if λ ≥ – max|λ| √ε.


#4

Thanks @stevengj. Do you think the JuliaStats organization would be open to a pull request to add eigenvalue decomposition as an optional method for the MvNormal?


#5

I think we’d be happy for such a contribution but it might require quite a bit of restructuring of the existing code. Some of it might also need a cleanup and I think the existing code might be more complicated than needed. I don’t know how familiar you are with degenerate normals but handling it will complicate some methods so that part might also require some work. Finally, it varies how much time the developers have available so a PR might take a while to get reviewed. It shouldn’t be like that but that is how it is.


#6

Thanks @andreasnoack. I’m still learning about Julia, but I’m familiar with degenerate multivariate normals. Just as a thought, I see in the documentation, that there are three different constructors for multivariate normal, depending on the covariance structure. What do you think of adding a fourth constructor (e.g. DegenNormal or something similar) to handle the cases where the covariance matrix is not of full rank? Or does that make things to confusing?


#7

Since Cholesky is so much faster than eigenvalue computation, I think it would make more sense to first try cholfact in a try block by default. Then, if it catches a positive-definite exception, it can do a (hermitian) diagonalization and check whether the negative eigenvalues are sufficiently small to ignore.


#8

@stevengj That seems reasonable to me. I think it would be nice to have an optional argument to force an eigen decomposition without trying the cholesky decomposition first. I don’t think this would be too challenging to add to the MvNormal constructor, aside from the current requirement for a PDMat. Thoughts?


#9

An eigendecomposition is 25–50x more expensive than Cholesky on my machine, so you will hardly notice that it is trying Cholesky first.


#10

@stevengj I knew that cholesky decomposition was faster than an eigen decomposition, but didn’t realize there could be that much of an advantage… Agreed then - trying cholesky first would be hardly noticeable, especially if you are trying multiple covariance matrices that may or may not be PD.

So do you think it makes sense to modify the current MvNormal class which requires a PDMat? The \Sigma argument could be of type PSDMat (or something like that) and the constructor will first try a cholesky decomposition (as you said), followed by the eigen decomposition? Or should this class be split up into two?


#11

I’ve faced the same problem. I hope your team fix this bug and make this package more suitable for enjoying bayesian or statistical machine learning :slight_smile:


#12

I have also faced this problem.

Is there a solution for the present situation?


#13

Just wanted to chime in that I have the same issue; advice would be appreciated.

julia> x_hat_F
2-element Array{Float64,1}:
  1.5999999999999996
 -1.3333333333333328

julia> Σ_F
2×2 Array{Float64,2}:
 0.133333  0.1
 0.1       0.15

julia> MvNormal(x_hat_F, Σ_F)
ERROR: PosDefException: matrix is not Hermitian; Cholesky factorization failed.
Stacktrace:
 [1] checkpositivedefinite(::Int64) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\factorization.jl:11
 [2] #cholesky!#91(::Bool, ::Function, ::Array{Float64,2}, ::Val{false}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\cholesky.jl:182
 [3] #cholesky! at .\none:0 [inlined] (repeats 2 times)
 [4] #cholesky#95 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\cholesky.jl:275 [inlined]
 [5] cholesky at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.0\LinearAlgebra\src\cholesky.jl:275 [inlined] (repeats 2 times)
 [6] Type at C:\Users\Arnav Sood\.julia\packages\PDMats\mL7bX\src\pdmat.jl:16 [inlined]
 [7] MvNormal(::Array{Float64,1}, ::Array{Float64,2}) at C:\Users\Arnav Sood\.julia\packages\Distributions\WHjOk\src\multivariate\mvnormal.jl:208
 [8] top-level scope at none:0

#14

GIven what you have here I can’t reproduce. How did you get x_hat_F and Σ_F?

The error says that Σ_F is not Hermitian. Have you tried

MvNormal(x_hat_F, Symmetric(Σ_F))

#15

For normal random variables with more forgiving constructors I also have https://github.com/mschauer/GaussianDistributions.jl


#16

Thanks guys, the Symmetric() call worked, but it’s good to be aware of that package.

@jkbest2 we got them from a series of matrix calculations, but I thought this kind of thing was Markov (it matters what the object is, not how we produced it).


#17

I would like to get rid of PDMats.jl altogether for distributions, and just use factorizations, see https://github.com/JuliaStats/Distributions.jl/issues/688


#18

@simonbyrne are you suggesting that one would pass the cholesky factorization (or other factorization) to MvNormal as opposed to Sigma?


#19

You’re right of course that only the objects matter. I was curious because the error said Σ_F was not Hermitian, but it printed as Hermitian in the REPL.


#20

Yes: we could also allow passing a Matrix, but have the constructor convert it to a Cholesky (or some other factorization).