SVD on a matrix that does not fit into memory

This is not the full SVD, but rather a rank-k approximation given by the k largest singular values.

If your desired k is small enough that k vectors do fit into memory (even though the whole matrix does not), then you could use e.g. KrylovKit.svdsolve or TSVD.tsvd, given only a function that applies your large matrix (and its adjoint, which these iterative methods also need) to a vector.
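For example, a matrix-free KrylovKit call might look like the following sketch. Here `outofcore_matvec` and `outofcore_matvec_adj` are hypothetical functions you would supply yourself (one possible implementation is sketched further below), and you should double-check the `svdsolve` docstring for the exact calling convention in your KrylovKit version:

```julia
using KrylovKit

m, n = 1_000_000, 50_000   # hypothetical dimensions of the on-disk matrix
k = 10                     # number of leading singular triplets wanted

# KrylovKit takes a single function that applies either A or A',
# selected by a Val flag in the second argument:
f(x, ::Val{false}) = outofcore_matvec(x)      # should return A*x  (length m)
f(x, ::Val{true})  = outofcore_matvec_adj(x)  # should return A'*x (length n)

x₀ = randn(m)   # starting guess for a left singular vector (see the KrylovKit docs)
vals, lvecs, rvecs, info = svdsolve(f, x₀, k, :LR)
```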

To multiply a matrix that does not fit into memory, you need an out-of-core algorithm. I’m not sure if any are implemented in Julia already, but for matrix–vector products they would be pretty easy to implement yourself: just read the matrix into memory in chunks (e.g. chunks of rows), multiply each chunk by your vector, and then combine the results — for A*x each row chunk gives a contiguous block of the output, and for A'*x you add up each chunk's contribution. A sketch follows below.
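Here is one hypothetical way such chunked products could look, assuming the matrix has been written to disk as raw Float64 data in blocks of `chunk_rows` rows, each block stored in Julia's column-major order (the filename, layout, and sizes are illustrative, not any standard format):

```julia
# Out-of-core matrix–vector products, reading the matrix in row chunks.
# Assumed (hypothetical) storage: "A.bin" holds m×n Float64 data written
# chunk by chunk, each chunk a (rows × n) matrix in column-major order.
m, n = 1_000_000, 50_000
chunk_rows = 10_000            # rows per chunk that fit in memory

# y = A*x: each row chunk produces a contiguous block of the output (stack).
function outofcore_matvec(x::AbstractVector)
    y = Vector{Float64}(undef, m)
    open("A.bin") do io
        for r1 in 1:chunk_rows:m
            r2 = min(r1 + chunk_rows - 1, m)
            chunk = Matrix{Float64}(undef, r2 - r1 + 1, n)
            read!(io, chunk)          # load the next block of rows from disk
            y[r1:r2] = chunk * x
        end
    end
    return y
end

# y = A'*x: each row chunk contributes an additive update to the output (sum).
function outofcore_matvec_adj(x::AbstractVector)
    y = zeros(Float64, n)
    open("A.bin") do io
        for r1 in 1:chunk_rows:m
            r2 = min(r1 + chunk_rows - 1, m)
            chunk = Matrix{Float64}(undef, r2 - r1 + 1, n)
            read!(io, chunk)
            y .+= chunk' * x[r1:r2]
        end
    end
    return y
end
```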

If you have a cluster, then you can use a distributed-memory linear-algebra package like Elemental.jl, or use an iterative package like KrylovKit or TSVD with DistributedArrays.jl.

(Nowadays, I think most people trying to solve this sort of problem would use a cluster or other parallel system rather than using out-of-core algorithms with filesystem storage.)
