I want to create a multivariate normal distribution with the exponentiated-quadratic covariance matrix. This code works. But if I make more points by replacing the step .3 with a smaller number like .1 the isposdef assertion fails so I can’t create a covariance matrix with it. Why does it fail to be positive definite and how can I fix it while retaining the covariance structure I want?
A diagonal matrix with positive diagonal values is posdef (as the diagonal values are the eigenvalues). If such a matrix is slightly perturbed, because of continuity (this can be made rigorous), the eigenvalues change only slightly. Specifically, they remain positive which makes the matrix posdef.
On the flip side, take any matrix and increase the diagonal by a large positive amount and the eigenvalues will eventually all become positive. This is known in the relevant theorems as a diagonally dominant matrix.
Does this explanation help?
For the specific manipualtion above, adding a multiple k of unit matrix, adds this k to all eigenvalues (because for unit matrix all vectors are eigenvectors…). So, with a large k (larger than most negative eigenvalues of original matrix), this would turn all eigenvalues positive i.e. posdef matrix.
You might also consider simply not using a covariance function that is analytic everywhere (in particular, at the origin). That is a significant source of the numerical problems, and unless you have specific reason to believe that the process you are working with should be analytic, then it probably isn’t even the most sensible. Maybe a better default would be to use a Matern covariance where you have chosen the order to give you two or three mean-square derivatives. You can pick orders to do that and also give evaluations that don’t require a special functions library. In one dimension, an order of 5/2 might be a simple thing to try. Note that the order that gives the number of derivatives you want does depend on the process dimension, though.
I recognize that the squared exponential is an appealing function because it’s a form that comes up in a lot of places, but unless you have a specific reason to use it it seems crazy to me to literally compute an entire extra opnorm or eigvals of your un-perturbed matrix in order to use a covariance function that is rarely a good default choice anyway.
Of course, I don’t know your application and maybe you have a specific reason for your choice and stuff. Just mentioning this in case you sort of picked it because whatever and are open to trying something else.
Gershgorin circle theorem is one very rough way to prove this. If the diagonal increases but not the other elements, then the radius is fixed and the center of the circle is moved until the cercle does not cover negative numbers.