How to find the linearly independent columns (rows) of a matrix

To expand on the potential problems with QR factorization with column pivoting that @stevengj alluded to, the standard counterexample is due to Kahan. The following code generates Kahan's matrix and compares the rank information revealed by pivoted QR factorization with that from the SVD:

using LinearAlgebra

n = 100

# Kahan's matrix K = S*C with c^2 + s^2 = 1: C is unit upper triangular
# with -c above the diagonal, and S scales row j by s^(j-1), so the rows
# decay geometrically (0 < s < 1 after normalization).
c = 1.0
s = 2.0
c, s = [c, s] / norm([c, s])   # normalize so that c^2 + s^2 = 1
C = triu(-c * ones(n, n), 1) + I
S = Diagonal([s^(j-1) for j in 1:n])
K = S * C

F = qr(K, ColumnNorm())        # QR with column pivoting
@show F.R[n, n]                # last diagonal entry of R
@show minimum(svdvals(K))      # smallest singular value of K

On my system this gives the output

F.R[n, n] = 1.5957114308101243e-5
minimum(svdvals(K)) = 3.615747500695125e-21

So the matrix is many orders of magnitude closer to a rank-deficient matrix than the value of R[n,n] reveals. The example is somewhat delicate: it is constructed so that no pivoting should occur, but depending on rounding the algorithm might pivot anyway. For example, when I started with s = 2.5 prior to normalization, QR with pivoting did do some pivoting and achieved results comparable to the SVD.

Despite improvements to the LAPACK pivoted QR routines in other respects, the algorithm can still fail to detect near rank deficiency on Kahan's matrix. That said, while imperfect, it is a reasonably reliable way to identify near rank deficiency in most cases. The SVD is the most reliable option, but QR with column pivoting is unlikely to be off by many orders of magnitude unless you have deliberately set out to break the algorithm or are very unlucky.
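To connect this back to the question in the title: the usual recipe with pivoted QR is to pick a tolerance, count the diagonal entries of R above it, and take that many leading entries of the pivot vector F.p as your independent columns. A minimal sketch, continuing from the factorization F above, with one common (but by no means canonical) choice of tolerance:

# Heuristic rank detection from the pivoted QR factorization above.
# The tolerance below is a common choice, not a universal rule.
tol = n * eps(eltype(K)) * abs(F.R[1, 1])
r = count(i -> abs(F.R[i, i]) > tol, 1:n)   # estimated numerical rank
indep_cols = F.p[1:r]   # columns of K judged numerically independent
@show r
@show rank(K)           # SVD-based estimate, for comparison

On Kahan's matrix the two estimates can disagree for exactly the reasons above: R[n,n] is far larger than the smallest singular value, so the QR heuristic can report a higher numerical rank than the SVD-based rank.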

For LU factorization, as already noted, partial pivoting is not at all reliable for this. (It also breaks on Kahan's matrix and on many simpler examples.) LU with complete pivoting is much better: one can prove bounds on how small the pivots must become when factoring a nearly rank-deficient matrix. But those bounds are no better than the corresponding bounds for pivoted QR, and it is likewise not 100% reliable. Also, as far as I know, Julia doesn't provide access to the LAPACK routines for LU with complete pivoting, so that option is not very convenient.
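If you want to experiment anyway, a complete-pivoting LU is short enough to sketch by hand. The following is a minimal, unoptimized illustration (the name lu_complete! is made up here, not a library function); it overwrites its argument with the combined L and U factors, returns the row and column permutations, and assumes no exact zero pivot is encountered:

# A sketch of LU with complete pivoting, since the stdlib's lu only
# does partial (row) pivoting. Illustration only: the full pivot
# search adds O(n^3) work on top of the factorization itself.
function lu_complete!(A::AbstractMatrix)
    n = size(A, 1)
    rowp = collect(1:n)
    colp = collect(1:n)
    for k in 1:n
        # find the largest-magnitude entry in the trailing submatrix
        idx = argmax(abs.(view(A, k:n, k:n)))
        i, j = idx[1] + k - 1, idx[2] + k - 1
        # swap rows/columns to bring it to the (k,k) pivot position
        A[[k, i], :] = A[[i, k], :]
        A[:, [k, j]] = A[:, [j, k]]
        rowp[[k, i]] = rowp[[i, k]]
        colp[[k, j]] = colp[[j, k]]
        # eliminate entries below the pivot (assumes A[k,k] != 0)
        for r in k+1:n
            A[r, k] /= A[k, k]
            A[r, k+1:end] .-= A[r, k] .* A[k, k+1:end]
        end
    end
    return A, rowp, colp
end

U, rowp, colp = lu_complete!(copy(K))
@show abs(U[n, n])   # last pivot; compare with minimum(svdvals(K))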
