I checked: just computing `x[QR.prow]` is also slow.
Thanks for your suggestion. The suggested method works for the full-rank case, but I am developing an algorithm for the rank-deficient case, which requires the former method. I think `A\x` dispatches to `qr(A)\x`, which computes a basic least-squares solution (not suited to the rank-deficient case).
Also, in some cases `A::Adjoint{<:Any, <:AbstractSparseMatrix}`, and due to memory issues I cannot materialize the adjoint (`OutOfMemoryError` with sparse `A'*A`), so I have to work with the orthogonal projection via `A.parent`.
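To make the adjoint case concrete, here is a minimal sketch of how the lazy wrapper behaves. The matrix `P` and the vector `x` are made-up toy data (my real matrices are much larger), and this only illustrates the wrapper, not the projection algorithm itself.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical toy data; the real matrix is assumed to be far larger.
P = sparse([1, 2, 3, 4], [1, 1, 2, 2], [1.0, 1.0, 1.0, 1.0])  # 4×2 sparse matrix
A = P'                                                         # lazy 2×4 adjoint, no copy is made

# The wrapper has the type mentioned above, and its parent is the original matrix.
is_lazy = A isa Adjoint{<:Any, <:AbstractSparseMatrix}
same_storage = A.parent === P      # same object as parent(A)

# Products with A and A' only read P's storage, so the adjoint is never materialized.
x = [1.0, 2.0, 3.0, 4.0]
y = A * x      # 2-vector, i.e. P' * x
z = A' * y     # 4-vector; A' unwraps back to P
```

Everything here goes through `A.parent`, so no transposed copy and no `A'*A` product is ever formed.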