Getting strange MV Normal samples based on sparse precision matrix

ElOceanografo · October 8, 2020, 8:15pm

I’ve noticed something odd while trying to generate Gaussian Markov random fields on an irregular mesh. In this model, a spatial field is a multivariate normal variable defined on the mesh nodes. The distribution is defined in terms of its precision matrix Q rather than its covariance \Sigma = Q^{-1}, since the Markov property means that Q is very sparse.

In theory, random samples can be generated efficiently from a precision matrix like so:

z = randn(size(Q,1))
x = cholesky(Q).U \ z

However, I’ve noticed that doing this with a sparse matrix Q produces bad samples–too spiky, without the correct correlation structure. In contrast, samples generated from the covariance matrix or the precision matrix converted to a dense matrix both look right (this precision matrix is defined such that the decorrelation scale should be about 20):

using DataFrames, CSV
using Random, SparseArrays, LinearAlgebra, Plots

Q_df = CSV.read("precision_matrix.csv")
point_df = CSV.read("mesh_points.csv")
Q = sparse(df.i, df.j, df.q)
Σ = Symmetric(inv(Matrix(Q)))

Random.seed!(2)
z = randn(size(Q, 1))

x = cholesky(Σ).L * z
p1 =surface(point_df.x, point_df.y, x, camera=(45, 80), title="Dense Σ")

xq = cholesky(Matrix(Q)).U \ z
p2 = surface(point_df.x, point_df.y, xq, camera=(45, 80), title="Dense Q")

xqs = cholesky(Q).U \ z
p3 = surface(point_df.x, point_df.y, xqs, camera=(45, 80), title="Sparse Q")

plot(p1, p2, p3, layout=(1,3), size=(1200, 400), colorbar=false)

The .csv files are available here:

This has to be some numeric issue with the sparse Cholesky and/or backsolve, right? Any ideas for what’s happening, or for fixes or workarounds?

juliohm · October 8, 2020, 9:30pm

@ElOceanografo do you have a reference on the precision matrix method? In GeoStats.jl’s LUGaussSim solver we implement the LU with the covariance matrix. The good thing about the covariance method is that it can be adapted for conditional simulation when data is available at specific points of the mesh.

ElOceanografo · October 8, 2020, 11:23pm

It’s constructed using the stochastic partial differential equation method in Lindgren et al. 2011, which I’ve been working on implementing in Julia.

I’ve actually used GeoStats to do conditional simulations for my work (thanks for the great package!). At the moment though I’m looking for ways to plug GPs into more elaborate probabilistic models (e.g. in Turing) and to scale them up to larger datasets. The SPDE/GMRF method is appealing for its scaling properties, both storage- and computation-wise.

juliohm · October 9, 2020, 3:35am

That is really nice! If you are working on a Gaussian solver for spatial variables, it would be great to integrate with the project. All the Gaussian solvers we have implemented currently live in GaussianSimulation.jl in case you feel like contributing your work.

andreasnoack · October 9, 2020, 9:24am

The sparse factorization uses pivoting so you’d have to apply the permutation vector as well. Try with

cholesky(Q).PtL' \ z

ElOceanografo · October 9, 2020, 4:02pm

That was it, thanks! I didn’t know that about the sparse version (though I guess it is in the docs, now that I look).

Topic		Replies	Views
Evaluating a MvNormal density with a sparse covariance matrix Performance question	1	183	May 1, 2023
Degenerate Multivariate Normal Sampling Statistics	1	392	October 28, 2023
[ANN] Random field generation with GaussianRandomFields.jl Community package , announcement	2	1668	February 1, 2018
Turing TypeErrror Issue Probabilistic programming	4	542	February 9, 2022
Random draws of multivariate normal with positive semi-definite covariance matrix Statistics linearalgebra	4	1084	March 26, 2022

Getting strange MV Normal samples based on sparse precision matrix

Related topics