The Householder QR factorization (a variant of which is used by Juliaās qr
function) does not compute Q explicitly ā instead, it stores a sequence of āHouseholder reflectorā vectors that can be used to compute Q*x
(or Q'*x
) for any vector x
almost as quickly as if you had an explicit Q
matrix. See, for example, these notes.
When you call Matrix(F.Q)
to compute Q
explicitly, it basically has to multiply the Householder reflectors by the columns of the identity matrix I
. To get a rough estimate of the cost, you can look at the flop count (floating-point additions and multiplications), and indeed the flop count for computing Q
explicitly is similar to the flop count for the QR factorization itself for square matrices, and in fact is larger if the original matrix is very ātallā (non-square)!
To get a faster approach that avoids forming a Q
submatrix explicitly, you should realize that Q1 = Q[1:N, 1:nx]
(where Q
is N+nx
Ć N+nx
) is equivalent to
Q1 = Matrix(I, N, N+nx) * Q * Matrix(I, N+nx, nx) = P1 * Q * P2
where P1
and P2
are very sparse. The trick is to exploit this sparsity, and multiply the P
matrices by your other vectors rather than Q
. You eventually want to compute (Ļbis*Q1)*Q1'
. This becomes:
(Ļbis*Q1)*Q1' == ((Ļbis*P1) * Q) * (P2*P2') * Q' * P1'
The key is to realize that al these multiplications by P
are equivalent to extracting a submatrix or padding with zeros, but we can do this after we multiply by Q
or Q'
. That is, we are doing:
f(Ļbis, Q, N, nx) = ([([Ļbis zeros(size(Ļbis,1), nx)] * Q)[axes(Ļbis,1), 1:nx] zeros(size(Ļbis,1), N)] * Q')[axes(Ļbis,1), 1:N]
On my machine,
@btime f($Ļbis, $(F.Q), $N, $nx);
takes about 850Āµs and produces the same result as (Ļbis*Q1)*Q1'
, whereas QR factorization took about 12ms.
Ideally, this would be automated. For that to happen, F.Q[range1, range2]
would have to return a new kind of matrix-like object that, upon multiplication, implicitly pads its input with zeros and extracts range1
rows of the result (or vice versa for left-multiply or multiplication by the transpose). This is certainly possible to implement in Julia (in the same way that F.Q
itself is a matrix-like object that implicitly multiplies by the Householder reflectors). A PR for this functionality might be of interest.
PS. To get the first nx
columns of Q
, a faster way is to multiply F.Q * Matrix(I, size(F.Q, 1), nx)
, but I donāt know of similarly easy and efficient way to get the first N
rows of the first nx
columns. (The algorithm for the latter in principle exists, but it would have to be implemented.)