Given a matrix A, I want to compute a matrix with rank(A) column vectors that span the range of A. With an SVD, this can be done with:

import LinearAlgebra as LA
function ortr_svd(A::AbstractMatrix{T}) where {T<:Number}
dec = LA.svd(A)
tol = maximum(dec.S) * maximum(size(A)) * eps(T)
rank = sum(dec.S .> tol)
dec.U[:, 1:rank] # The first `rank` columns of U span A.
end

Because this does not work with sparse matrices, I want to find an alternative using a QR decomposition. My attempt is:

function ortr_qr(A::AbstractMatrix{T}) where {T<:Number}
dec = LA.qr(A, LA.ColumnNorm())
tol = maximum(abs.(dec.R)) * maximum(size(A)) * eps(T)
rank = sum(abs.(LA.Diagonal(dec.R)) .> tol)
dec.Q[:, 1:rank]
end

This seems to work when A is dense, but otherwise I get the error

ERROR: Pivoting Strategies are not supported by `SparseMatrixCSC`s

Is there any workaround? For instance, can I do it without any pivoting, or can I make pivoting work on sparse arrays?

See also: How to find the linearly independent columns (rows) of a matrix - #14 by stevengj (although this wasn’t written with sparse QR in mind). This returns a list of which of the columns of A are independent (up to some tolerance), and you can then use those columns as a basis — this might be a good approach to adapt for sparse A since its columns are sparse (albeit not orthonormal).

What sizes and ranks of sparse matrices are you interested in?

@stevengj In fact I had read your post before, but got stuck in adapting it to sparse matrices because the qr does not allow for pivoting if the matrix is sparse. Together with the answer by @abulak, this can be solved by getting the permutations from the rpivinv field of a SparseArrays.SPQR.QRSparse.

I expect usage with matrices of quite varying sizes (order thousands to tens of thousands) and the rank should often be around 1/5 of the dimension.

@abulak I had previously attempted something quite similar, but I was using F.prow instead of F.rpivinv for the rows permutation vector, and this somehow was not working. Indeed, these two fields are not equal, but I cannot seem to find the difference between them. Would you happen to know this?

These are inverses of each other – the difference is on which side you’re permuting rows. F.prow is permuting rows of A, its inverse will be permuting rows of Q. For solving linear equations with \ the prow is more useful. For our purpose we need the correct rows of Q.

Btw. every time when you ask for F.prow you’re inverting F.rpivinv (which you can also access like this, no need to getfield(F, :rpivinv).