# Copying a QR decomposition matrix Q is extremely slow

This is on julia 1.5.3

``````julia> f = qr(ones(100, 20), Val(true));

julia> copy(f.Q); @time copy(f.Q);
0.057310 seconds (40.00 k allocations: 338.516 MiB, 10.85% gc time)

julia> copy(Array(f.Q)); @time copy(Array(f.Q));
0.000074 seconds (7 allocations: 84.969 KiB)

julia> typeof(copy(f.Q)) == typeof(copy(Array(f.Q)))
true
``````

The `copy` appears to be called in `svdvals` which slows to a crawl for large matrices. Is this a known issue?

It appears that `copy(f.Q)` relies on a default method for `AbstractArray`s, which might explain the difference in performance. Is this something that may be improved?

Upon further introspection it appears that indexing into `Q` is slow.

``````function getindex(Q::AbstractQ, i::Integer, j::Integer)
x = zeros(eltype(Q), size(Q, 1))
x[i] = 1
y = zeros(eltype(Q), size(Q, 2))
y[j] = 1
return dot(x, lmul!(Q, y))
end
``````

results in

``````julia> @btime \$f.Q[1, 1];
22.795 μs (6 allocations: 48.84 KiB)
``````

As a consequence

``````julia> @time f.Q[:,:];
30.861779 seconds (4.00 M allocations: 46.529 GiB, 2.50% gc time)
``````

I’m definitely not an expert here, but is that because the`q` isn’t present as a dense matrix until you copy it. It’s stored in packed form as a product of Householder reflectors. Do you actually a dense copy? You can multiply by it while still in packed form.

2 Likes

@malacroi is correct. Extracting Q from the data structure you get from qr or qr! is expensive. IIRC, it’s O(N^3) work. If you just want to multiply D by Q’, for example, then `f.Q'*B` is what you want.

For example

``````julia> A=rand(1000,1000); AC=copy(A); AF=qr!(A);

julia> C=AF.Q*AF.R; norm(C-AC)
4.49007e-13

julia> D=AF.Q'*AC; RC=copy(AF.R); norm(D-RC)
1.27313e-13
``````

The copy of AF.R is cheap because you don’t have to compute anything new. Not so if you do AF.Q.

1 Like

The fast way to extract the `Q` factor as a dense matrix is just what you already do in your first example, namely `Array(f.Q)`. Indeed, `copy` will probably fall back on a general method that uses indexing, which is slow for the way `f.Q` is stored/represented.

If you want to make a copy of an array or some object that behaves as it, but is maybe not explicit (like a Q factor, or a transposed matrix), and you want the result to be a explicit `Array` that has fast indexing and its own memory, then `Array(object)` is a good way.

1 Like

Looks like we’re missing a specialized `copy(x::QRPackedQ) = Array(x)` method for performance. But maybe that method should even return a `QRPackedQ` object, since `copy` doesn’t have to return an `Array`. You could file an issue in GitHub.

`C=Array(AF.Q)` is ok if you must have Q. It is far better than copy. The allocation burden seems a bit steep for the 8MB you need to store Q.

``````julia> A=rand(1000,1000); AF=qr!(A);

julia> @btime C=Array(\$AF.Q);
31.310 ms (8 allocations: 22.89 MiB)
``````

Currently `svdvals` relies on a copy.

``````svdvals(A::AbstractMatrix{<:BlasFloat}) = svdvals!(copy(A))
``````

I realize that I may evaluate `svdvals(Array(Q))` instead of `svdvals(Q)`, however is it reasonable to expect `svdvals` to perform this by itself? This is a different question from whether we should implement a specific method for `copy(::AbstractQ)`, as — from the suggestions above — `copy` might not be be the best choice for certain types of matrices.