I am trying to generate random draws from a multivariate normal distribution with positive semi-definite covariance matrices. The Distributions.jl MvNormal only handles positive definite covariance matrices and throws an exception otherwise. There was a discussion about adding support for semi-definite covariance matrices a few years ago here but it is still not supported. While MvNormal does much more than generating random numbers, I am only interested in the random number generation.

My current solution is to do a pivoted cholesky decomposition and multiply by independent standard normals:

mu = [0,0]
Sig = [1 1;1 1]
chol = cholesky(Sig,Val(true),check=false)
random_draw = mu + chol.L[invperm(chol.p),1:chol.rank]*randn(chol.rank)

I check that the eigenvalues of Sig are non-negative before I generate the draws: minimum(eigvals(Sig)) >= 0. This has worked properly in a few problems I have tested.

I would appreciate any feedback on my approach. Is there an easier and/or more efficient way to achieve the same results? Even better, is there any package that already generates multivariate normals with positive semi-definite covariance matrices?

The problem is that you may get slightly negative pivots due to roundoff errors, in which case LAPACK may terminate the factorization prematurely.

Ideally there would be a way to specify some tolerance so that slightly negative pivots could be treated as zeros, but I don’t know of any API for this?

I am using the check=false option to ensure that the cholesky decomposition does not stop with small negative pivots. But I need to check that the matrix is positive semi-definite before I can use this, otherwise I may get wrong results without any exceptions if I pass a matrix that is not positive semi-definite. I am using the eigvals(Sig) to check for positive semi-definiteness.

The sqrt is doing the eigendecomposition already. Is there any optional argument to make it throw an exception if the eigenvalues are negative beyond the tolerance? By default it returns a complex matrix.

I’m not sure this does what you think. Read the LAPACK docs in the linked thread and LAPACK technical report #161 section 7: my understanding is that the Cholesky factorization will stop early if it encounters a negative pivot. check=false just means that you get the partially factorized result with no error thrown.

Wait, I just noticed that you are only looking at columns 1:chol.rank of the L factor — I think you are right, it is okay in this case:

julia> B = randn(3,7); A = B'B; F = cholesky(A,Val(true),check=false);
julia> L = F.L[invperm(F.p),1:F.rank]; L*L' ≈ A
true

First, if this doesn’t guarantee that the Cholesky factorization won’t encounter a slightly negative pivot due to roundoff errors. Second, you could easily have a matrix that is theoretically semidefinite but in practice has slightly negative eigenvalues, again due to roundoff errors. For example: