Also, since you’re computing the SVD of AᵀQ, which is presumably a small matrix even if A is large and sparse, wouldn’t you just use the dense SVD on this?
The whole idea came from the RSVDPACK paper, chapter 3.4. The claim is that for very large matrices, eigen(A' * A) is much faster than svd(A). And this is easy to check:
A = randn(100000, 1000);
@time svd(A);
@time eigen(A' * A);
20.607551 seconds (12 allocations: 1.527 GiB, 0.16% gc time)
2.657308 seconds (17 allocations: 30.877 MiB)
And it gets worse for larger matrices.
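For context, the eigen-based route recovers the thin SVD of a tall A from the small Gram matrix: if A' * A = V Λ V', then the singular values of A are the square roots of Λ and U can be recovered as A * V scaled by the inverse singular values. A minimal sketch of that idea (illustrative only, not the actual eigSVD implementation under discussion):

using LinearAlgebra

# Sketch of SVD via eigen(A' * A) for a tall m×n matrix A (n ≪ m).
# The function name and details are illustrative, not the eigSVD under review.
function svd_via_gram(A::AbstractMatrix)
    F = eigen(Symmetric(Matrix(A' * A)))              # n×n eigenproblem
    idx = sortperm(F.values; rev = true)              # largest eigenvalues first
    λ = max.(F.values[idx], zero(eltype(F.values)))   # clip round-off negatives
    S = sqrt.(λ)                                      # singular values of A
    V = F.vectors[:, idx]                             # right singular vectors
    U = (A * V) ./ S'                                 # left singular vectors (assumes S .> 0)
    return U, S, V
end

A = randn(10_000, 200);
U, S, V = svd_via_gram(A);
norm(U * Diagonal(S) * V' - A)                        # ≈ 0 up to round-off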
As for the conditioning problem, here is what they write:
“A disadvantage of using this method is that the matrix BB^∗ has essentially the square of the condition number of B. As a result, very small singular values of A near machine precision may not be properly resolved. This is an issue only if A is expected to have very small singular values amongst σ_1, …, σ_k.”
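The squaring itself is easy to see numerically; for the 2-norm it holds exactly, cond(A' * A) = cond(A)^2:

using LinearAlgebra

A = randn(2000, 100);
cond(A), cond(A' * A)   # the second value is the square of the first, up to round-off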
That’s a bit concerning, but in the end, svd has really bad asymptotic scaling with matrix size, and we need to run it on A' * Q, which has a low number of columns but as many rows as A. Given that I personally need the method to work with matrices of >10M rows, using the normal svd is not an option…
Even for sparse matrices, AᵀA squares the condition number. I’m not sure why you consider KrylovKit a “heavy dependency” and not IterativeSolvers.jl?
IterativeSolvers just doesn’t have any real dependencies, and KrylovKit has more of them, including GPUArraysCore. Though it’s a minor thing.
How does your function compare to calling KrylovKit.svdsolve directly on A?
On the synthetic example A = randn(100000, 1000), for the top 50 values, eigSVD returns the same values as svdsolve or eigsolve(A' * A, ...), but eigSVD takes 2.5s and eigsolve 1.7s, while svdsolve takes 74s. Surprisingly, svdsolve is much slower than the full svd.
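For reference, a sketch of how such a comparison can be set up, assuming KrylovKit’s convenience methods that accept a dense matrix directly; the exact tolerances and Krylov dimensions behind the timings above are not reproduced here:

using KrylovKit, LinearAlgebra

A = randn(100_000, 1_000);
k = 50

# Krylov SVD directly on A: largest singular values first.
svals, _, _, _ = svdsolve(A, k, :LR);

# Krylov eigendecomposition of the symmetric Gram matrix: largest eigenvalues first.
evals, _, _ = eigsolve(Symmetric(A' * A), k, :LR);

# The two agree up to squaring: σᵢ² ≈ λᵢ.
maximum(abs, svals[1:k] .^ 2 .- evals[1:k]) / evals[1]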
However, in these tests I observed that IterativeSolvers.lobpcg sometimes fails, while eigsolve behaves much more stably. So I guess I’ll switch to KrylovKit.eigsolve in the end.
Note also that your use of randn means that you will automatically promote to Float64. Better to use randn(eltype(A), n, k+s) so that it uses the type of A.
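A quick illustration of the difference, with Float32 chosen just as an example element type (k and s as in randn(eltype(A), n, k+s) above):

A = randn(Float32, 10_000, 200);
k, s = 50, 10

Q64 = randn(size(A, 1), k + s)             # Float64 by default
Q32 = randn(eltype(A), size(A, 1), k + s)  # stays in A's element type

eltype(A' * Q64), eltype(A' * Q32)         # (Float64, Float32)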
While I see how it helps for consistency, I’m not sure that would be correct in this case. Multiplying large matrices involves a lot of summation inside, which can lead to overflow. As an example, below the values in A are in the range 0-1000 and Q is standard normal, yet A' * Q consists entirely of Inf. And if we use low-precision floats to save memory, type promotion makes total sense to me: A is large, so it’s Float16, but Q and A' * Q are much smaller, so we can afford to make them Float64. Same logic as with sparsity.
A = rand(Float16, 100000, 200) .* 1000;
Q = randn(eltype(A), size(A, 1), 10);
(A' * Q)[1:2, 1:5]
2×5 Matrix{Float16}:
Inf Inf Inf -Inf Inf
Inf Inf Inf -Inf -Inf
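And for comparison, keeping only Q in Float64, as argued above, avoids the overflow on the same kind of A:

A = rand(Float16, 100000, 200) .* 1000;   # same setup as above
Q64 = randn(Float64, size(A, 1), 10);
B = A' * Q64;                             # mixed Float16/Float64 product, result is Float64
eltype(B), any(isinf, B)                  # (Float64, false): the sums no longer overflow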