The Householder QR factorization (a variant of which is used by Julia’s
qr function) does not compute Q explicitly — instead, it stores a sequence of “Householder reflector” vectors that can be used to compute
Q'*x) for any vector
x almost as quickly as if you had an explicit
Q matrix. See, for example, these notes.
When you call
Matrix(F.Q) to compute
Q explicitly, it basically has to multiply the Householder reflectors by the columns of the identity matrix
I. To get a rough estimate of the cost, you can look at the flop count (floating-point additions and multiplications), and indeed the flop count for computing
Q explicitly is similar to the flop count for the QR factorization itself for square matrices, and in fact is larger if the original matrix is very “tall” (non-square)!
To get a faster approach that avoids forming a
Q submatrix explicitly, you should realize that
Q1 = Q[1:N, 1:nx] (where
N+nx) is equivalent to
Q1 = Matrix(I, N, N+nx) * Q * Matrix(I, N+nx, nx) = P1 * Q * P2
P2 are very sparse. The trick is to exploit this sparsity, and multiply the
P matrices by your other vectors rather than
Q. You eventually want to compute
(ψbis*Q1)*Q1'. This becomes:
(ψbis*Q1)*Q1' == ((ψbis*P1) * Q) * (P2*P2') * Q' * P1'
The key is to realize that al these multiplications by
P are equivalent to extracting a submatrix or padding with zeros, but we can do this after we multiply by
Q'. That is, we are doing:
f(ψbis, Q, N, nx) = ([([ψbis zeros(size(ψbis,1), nx)] * Q)[axes(ψbis,1), 1:nx] zeros(size(ψbis,1), N)] * Q')[axes(ψbis,1), 1:N]
On my machine,
@btime f($ψbis, $(F.Q), $N, $nx);
takes about 850µs and produces the same result as
(ψbis*Q1)*Q1', whereas QR factorization took about 12ms.
Ideally, this would be automated. For that to happen,
F.Q[range1, range2] would have to return a new kind of matrix-like object that, upon multiplication, implicitly pads its input with zeros and extracts
range1 rows of the result (or vice versa for left-multiply or multiplication by the transpose). This is certainly possible to implement in Julia (in the same way that
F.Q itself is a matrix-like object that implicitly multiplies by the Householder reflectors). A PR for this functionality might be of interest.
PS. To get the first
nx columns of
Q, a faster way is to multiply
F.Q * Matrix(I, size(F.Q, 1), nx), but I don’t know of similarly easy and efficient way to get the first
N rows of the first
nx columns. (The algorithm for the latter in principle exists, but it would have to be implemented.)