Tolerance specification for pivoted Cholesky

In this notebook I describe evaluating the rank of a model matrix X by forming a pivoted Cholesky factorization of X'X. I use this form because the X'X matrix will be evaluated whether or not the rank is needed.
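A minimal sketch of that computation, assuming a recent Julia where the pivoted Cholesky is requested with `RowMaximum()` (older versions spell it `Val(true)`); the data are illustrative:

```julia
using LinearAlgebra

X = randn(100, 4)                # illustrative model matrix
A = Symmetric(X'X)               # the Gram matrix, which is formed in any case
# Pivoted Cholesky via LAPACK's dpstrf; check = false so that a
# rank-deficient A does not throw an exception.
F = cholesky(A, RowMaximum(); tol = -1.0, check = false)
F.rank                           # estimated rank of X
```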

I would appreciate it if those who know the numerical linear algebra involved could tell me whether my discussion of the tolerance is reasonable.

Your link isn’t rendered correctly (use nbviewer), but realistically, if you want a response, you should post the code directly.

You have:

> As explained in the documentation for dpstrf, this value [tol] should be negated to be used as a relative tolerance. Otherwise it is used as an absolute tolerance which usually is not what you want.

This is not quite right. If tol is less than zero, dpstrf uses the recommended default value n*eps()*maximum(diag(A)), where A is n \times n (this default is relative, but it is not scaled by tol). Experiments showed that this yields a stable factorization and accurate rank estimates for the tested matrices whenever the condition number of the positive-definite projection is less than about 1/(n*eps()). If the condition number is worse, both the factors and the rank estimate are questionable, IIUC.
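To make the two tolerance modes concrete, here is a sketch with a deliberately rank-deficient X (the tol values are illustrative):

```julia
using LinearAlgebra

X = randn(100, 3)
X = [X  X[:, 1] + X[:, 2]]       # 4 columns, but only rank 3
A = Symmetric(X'X)

# tol < 0: dpstrf applies its default n * eps() * maximum(diag(A)),
# a threshold relative to the largest diagonal element.
cholesky(A, RowMaximum(); tol = -1.0, check = false).rank   # expect 3

# tol ≥ 0: the value is used as an absolute threshold on the pivots.
cholesky(A, RowMaximum(); tol = 1e-8, check = false).rank
```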

Thus your suggestion of \sqrt{\epsilon} is perhaps too conservative.

Usually one cares more about identifying linearly dependent variables in the model matrix than about the rank itself. If there are linearly dependent variables, they can be dropped so that one obtains a model matrix of full rank. If the approach to solving the problem is through inversion of the normal matrix, I believe the best method would be:

  1. Compute the QR decomposition of X.
  2. The diagonal elements of R identify the columns: linearly independent columns correspond to non-zero diagonal elements, linearly dependent ones to (numerically) zero elements (see the sketch after this list).
  3. Then one can compute the Cholesky decomposition of the normal matrix… For calculating the Gram matrix one can use R^{\top}R.
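A minimal sketch of steps 1–2, using a column-pivoted QR (`qr(X, ColumnNorm())` in recent Julia) for numerical reliability; the helper name `independent_columns` and the default tolerance are illustrative assumptions, not part of the original post:

```julia
using LinearAlgebra

# Identify the linearly independent columns of X from a column-pivoted QR.
# With pivoting, the magnitudes of diag(R) are non-increasing, so the
# columns beyond the first numerically zero pivot are the dependent ones.
function independent_columns(X::AbstractMatrix; tol = size(X, 2) * eps())
    F = qr(X, ColumnNorm())            # X[:, F.p] == Q * R
    d = abs.(diag(F.R))
    r = count(>(tol * first(d)), d)    # numerical rank
    sort(F.p[1:r])                     # original indices of the kept columns
end
```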

The QR decomposition is preferable to the singular value decomposition (which rank uses) and can be tailored to use similar tolerances.


Edit:
Actually, I was thinking from the perspective of using iterative solvers. If you are using the Cholesky decomposition to solve the problem, then it is better not to do the QR decomposition, since QR is \mathcal{O}\left(m n^{2}\right) whereas the Cholesky factorization of the Gram matrix is \mathcal{O}\left(n^{3}\right). The R in the QR decomposition is the same, up to the signs of its rows, as the upper triangular Cholesky factor of the Gram matrix. You can identify which columns are linearly independent by checking which diagonal elements of the factor are non-zero, as in the check below.
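A quick numerical check of that correspondence with a random X (illustrative data; the row signs of R are not determined by the QR decomposition, so the comparison uses absolute values):

```julia
using LinearAlgebra

X = randn(100, 5)
R = qr(X).R                      # upper triangular factor from QR of X
U = cholesky(Symmetric(X'X)).U   # upper triangular Cholesky factor of X'X
@assert abs.(R) ≈ abs.(U)        # equal up to the signs of the rows of R
```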


As an aside, I still need to verify whether the R from the QR decomposition can be linked to the singular value decomposition for calculating the effective degrees of freedom, i.e., \text{Tr}\left[ X \left(X^{\top}X + \Gamma\right)^{-1} X^{\top}\right].
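For the special case of a ridge-type penalty \Gamma = \lambda I, that trace reduces to the standard SVD expression \sum_i \sigma_i^2 / (\sigma_i^2 + \lambda). A small sketch with illustrative data:

```julia
using LinearAlgebra

X = randn(50, 4)
λ = 0.1
Γ = λ * I(4)                     # ridge-type penalty, a special case of Γ

# Direct evaluation of Tr[X (X'X + Γ)⁻¹ X']
edf = tr(X * ((X'X + Γ) \ X'))

# For Γ = λI this agrees with the SVD form Σᵢ σᵢ² / (σᵢ² + λ)
σ = svdvals(X)
@assert edf ≈ sum(σ .^ 2 ./ (σ .^ 2 .+ λ))
```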

This approach can be somewhat brittle (the estimates change discontinuously as a function of the data). In practice, using a vague but proper prior or hierarchical/multilevel models is more robust.

I use it mostly for instrumental variables estimators such as 2SLS, or after transformations such as the fixed-effects transformation. In those cases it makes sense for the user to specify a model whose model matrix, for some estimations, has to be trimmed to a subset of columns that is of full rank.

@Ralph_Smith Thank you for the correction on my reading of the documentation.

The reason that I chose the square root of eps() is that the positive semidefinite matrix is constructed as X'X, and it is rank deficiency of X that I want to detect: the singular values of X'X are the squares of the singular values of X. This is also related to the next comment by @Nosferican.
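A one-line check of that squaring relationship, with illustrative data:

```julia
using LinearAlgebra

X = randn(10, 4)
# The singular values of X'X are the squares of those of X, so a
# tolerance applied to X'X acts on the squared scale of X.
@assert svdvals(X'X) ≈ svdvals(X) .^ 2
```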

@Tamas_Papp I agree that the use of hierarchical/multilevel models or, more generally, mixed-effects models is more robust. This calculation is being performed within the MixedModels package, whose purpose is to fit exactly those models.

I have been contacted by users who have some factors they are modeling as fixed effects terms that may also be rank deficient.
