Eigenvectors of very large (non-sparse) symmetric matrices?

JedPhillips · April 23, 2021, 3:49pm

I’m trying to get the first 50 dominant eigenvectors of a huge symmetric matrix, of dimension 65000 x 65000. I’m doing this on a computer cluster, so there’s no RAM limitation. But it still takes forever. In fact I don’t even know how long – I let it run for 2 days and killed the job because it hadn’t finished yet. The code is simple:

ev = eigvecs(eigen(Symmetric(P), 64950:65000))

Where P is an {Int64} matrix.

Is there a faster function besides eigen() that I should be using here? This was the only one I could find that allowed me to specify the K largest (or smallest) eigenvectors, but perhaps others do. This also doesn’t seem to be running parallel, which would otherwise speed things up massively as well, as I likewise could split it across hundreds of cores if there is a way to parallelize.

Or, is there a more creative solution here?

PetrKryslUCSD · April 23, 2021, 4:14pm

Are you interested in the largest eigenvectors? If so, how about block power iteration?

rveltz · April 23, 2021, 5:04pm

I guess it is a sparse matrix. In any case, have you considered iterative methods? I use a lot KrylovKit.jl for computing egen elements. You also have IterativeSolver.jl

PetrKryslUCSD · April 23, 2021, 5:07pm

I think the point was that the matrix was not sparse.

JedPhillips · April 23, 2021, 5:17pm

Yes – I only want the first 50 eigenvectors with the largest corresponding eigenvalues. I’ve never heard of block power iterations before, is there a Julia routine for it?

And you’re correct – the matrix is not sparse. It’s a distance matrix, with zeros on the diagonal, but non-zero (strictly positive) everywhere else.

PetrKryslUCSD · April 23, 2021, 5:23pm

I have a Matlab code that would be easy to tweak for this purpose: https://github.com/PetrKryslUCSD/AETNA/blob/master/investigations/EigenvalueProblems/bpwr2.m
I haven’t had time yet to convert that toolkit to Julia, sorry.

stevengj · April 23, 2021, 5:58pm

You might as well use restarted Lanczos or some other Krylov algorithm, no? That should be strictly better than block-power iterations.

Krylov methods aren’t limited to sparse matrices — you just need a way to multiply matrices by vectors. If you have a fast way to multiply matrices by vectors (either because of sparsity or some other structure, e.g. using fast-multipole methods), then they are faster, but you can also use dense matrices, especially if you want the largest-|λ| eigenvalues so that you don’t need shift-and-invert.

Where are you getting such a large dense matrix from? Maybe it has some other structure you can exploit. (At the very least you can parallelize the matrix-vector multiplication.)

JedPhillips · April 23, 2021, 6:53pm

This is a phylogenetic distance matrix for 60,000 animal species. The goal is to find the eigenvectors that best explain the phylogenetic dissimilarity among species. Unfortunately it’s inherently dense.

rveltz · April 23, 2021, 7:17pm

I completely second that.

stevengj · April 23, 2021, 8:18pm

It might be possible to form an approximate “compressed” representation of this matrix that you can multiply by quickly. For example, a lot of distance-like matrices have (approximate) hierarchical low rank structure or other combinations of low-rank structures.

andreasnoack · April 25, 2021, 3:32pm

It might also be worth trying this calculation with MKL.jl since MKL has a really really fast reduction to symmetric tridiagonal form. Particularly, if you happen to have a machine with many cores.

cgeoga · April 25, 2021, 7:57pm

as @stevengj suggests, your matrix may still have a lot of structure despite being dense. I have a package for H-matrices that implements a particularly simple hierarchical format called HODLR (“hierarchically off-diagonal low rank”). This of course only helps if there is structure to be exploited, and if you’re interested I (or somebody else) can expand on common sources of that structure.

On my machine (an intel i5-11600K with a completely broken BLAS-3 until openblas 3.14 comes to Fedora), in the case of a matrix that very much does have that structure, I can assemble a 60k square matrix in about 130ms. And for matrices that would actually fit in RAM, you could check that the matvec accuracy is quite good. Probably not quite tol (see below), but probably better than 10*tol. Here’s an example:

using KernelMatrices, KernelMatrices.HODLR, BenchmarkTools

# this could also have the syntax kernel_entry(j, k, parameters), or
# kernel_function(ptj, ptk, parameters), where you'd construct K with a vector
# of points instead of the integer indices.
function kernel_entry(j,k)
  exp(-abs(j-k)/100)
end

K = KernelMatrix(1:60_000, 1:60_000, kernel_entry)

tol     = 1e-14 # ACA fact precision
maxrank = 0     # maximum rank
level   = HODLR.LogLevel(8) # log2(n) - 8 dyadic 2x2 splittings

# 130 ms with a single thread on my computer. Not too bad. 
# if you want a direct solve, you can also factorize this matrix,
# and assuming that the off-diagonal block rank is bounded then the 
# factorization and solve will be O(n log^2 n).
@btime HK = KernelHODLR($K, $tol, $maxrank, $level, nystrom=false)

The package really depends on being able to conveniently get individual elements of your matrix, though. So if you have something that looks like \textbf{X} \textbf{X}^T or something, unless \textbf{X} is pretty skinny then this might not be super helpful. But maybe something to look into.

Topic		Replies	Views
Fast calculation of eigenvalues Numerics	6	3970	October 28, 2019
Special method for finding eigenvalues of a large sparse symmetric matrix? Numerics	17	6236	May 29, 2019
Distributed eigensolvers for computing eigenpairs of large sparse matrices Numerics question , package , parallel , distributed , eigenvalues	6	605	September 28, 2022
Finding all the eigenvalues of a sparse matrix? Numerics linearalgebra , sparse , eigenvalues , matrix	25	1473	July 18, 2024
Need to find out all eigenvalues and vectors of large Hermitian matrices Numerics	1	298	October 17, 2023

Eigenvectors of very large (non-sparse) symmetric matrices?

Related topics