Large Sparse Hermitian Matrix - Intermediate Eigenvalue Problem

Hi,

I am trying to find eigenvalues and eigenvectors of a rather large, sparse, Hermitian matrix (size ~1E7 x 1E7). I am mainly interested in around 40 eigenvalues in the middle of the spectrum.

Originally, I solved this with the ARPACK library, which works fine with shift-invert: I get correct eigenvalues and eigenvectors, but it takes a very long time. Previously, for smaller problems, I used a JDQR MATLAB solver on my local machine. I have seen a similar implementation in Julia in the JacobiDavidson package. I have installed it and begun testing, but I can’t seem to figure out how the output from the jdqr solver itself is structured. I have looked through the GitHub repository and haven’t been able to decipher anything useful :frowning: (all examples are given for the jdqz routine, which is throwing me a bit).

Has anyone gotten this library to work for them? Or is there any other library I should be looking at for this problem.

Here is a snippet of what I am running:

# S is the matrix data read in from a file and used to build A below: column 1 is the row index, column 2 is the column index, and columns 3 and 4 are the real and imaginary parts of the value at that row and column.

A = Hermitian(sparse(Int.(S[:,1]), Int.(S[:,2]), complex.(S[:,3], S[:,4])))
Aop = LinearMap(A)

target = Near(complex(dE, 0.0)) # Near requires a complex value; all eigenvalues should be real.

n = size(A, 1)
levels = 20
solver = BiCGStabl(n, max_mv_products = 10, l = 2)

pschur, residuals = jdqr(Aop;
    solver = solver,
    target = target,
    pairs = levels,
    tolerance = 1e-8,
    subspace_dimensions = levels+5:2*levels+10,
    max_iter = 25000,
    verbosity = 1
)

Basically, I am confused about the format of pschur and how to get my eigenvalues and eigenvectors from it.

Any help is appreciated!

Is it possible for you to share the matrix? I think one of the most efficient approaches would be to compute the action of the inverse of the shifted matrix (A-\sigma I)^{-1} on a vector x, rather than wrapping the entire matrix in a LinearMap (this is explained in this post).

If you can share the matrix, I can try to give you an example for this case since I have been doing something similar these past weeks.

Thank you for the reply and the useful link; I shall give that one a try!

I have attached a much smaller system for you to take a look at. The format of the file is as described in my initial post.

Dropbox Matrix

A bit of context for the problem: I am running a finite-difference solver for an 8-band k.p model in 3D. The exact problem we are trying to solve requires a very small step size, hence the large matrix.

Is it real and symmetric? Then perhaps PetrKryslUCSD/VibrationGEPHelpers.jl (Generalized Eigenvalue Problem helper functions) might be of help?

Assuming the matrix A is loaded into memory, you can try the following code snippet, which uses ArnoldiMethod.jl, a native Julia implementation:

using LinearAlgebra, SparseArrays, LinearMaps, ArnoldiMethod

struct ShiftAndInvert{TA}
    A_factor::TA
end

function shiftinvert_operator(A, σ)
    n = size(A, 1)
    # Factorize A - σI once; the factorization is reused for every application below.
    T = LinearMap{eltype(A)}(ShiftAndInvert(lu(A - σ * I)), n; ismutating=true)
    return T
end

function (M::ShiftAndInvert)(y, x)
    (; A_factor) = M
    ldiv!(y, A_factor, x) # computes y = (A - σI)⁻¹ x reusing the factorization
end


σ = 2.0
T = shiftinvert_operator(A, σ)

decomp, history = partialschur(T; which=:LM, nev=10, tol=1e-10)
history.converged || error("The eigenvalue solver did not converge.")
μ, Vs = partialeigen(decomp)

# λ of the original problem are λ = 1 / μ + σ
λ = 1 ./ μ .+ σ

For your bigger matrix, an issue could be that you run out of memory in the factorization step (here I just chose lu for simplicity, since I saw that the matrix is not positive definite, so cholesky is not applicable). If this is the case, I have had good experiences in the past with the sparse factorization from Pardiso, which would take the place of lu in the construction of the LinearMap and would then be used by the eigenvalue routine for each application of the operator.

It’s not real and symmetric unfortunately, just Hermitian! Thanks for the idea though!

I have given a bit of time to testing a few of the different ideas you have mentioned.

  1. ArnoldiMethod.jl (as its documentation suggests) provides performance similar to ARPACK.jl; in my testing ArnoldiMethod seems a little faster.

  2. I tested the shift-invert method you provided (thank you for that, by the way). While it worked well, I ran into memory issues quite quickly.

  3. Using the Pardiso.jl library I was able to get 10 eigenvalues in ~10 minutes from a 500,000 x 500,000 sparse matrix test case. However, I am now running into memory issues again when I increase the size of my system further (now ~4,000,000 x 4,000,000). The peak memory required by Pardiso is 440 GB (the HPC node only has 400 GB available).

A stripped-back version of my current setup is shown below:

using DelimitedFiles
using LinearAlgebra
using SparseArrays
using LinearMaps
using ArnoldiMethod
using Pardiso

struct ShiftAndInvertPardiso
    ps::MKLPardisoSolver
    A_pardiso::Any
end

"""
    shiftinvert_pardiso_operator(A, σ; matrixtype=:COMPLEX_HERM_INDEF)

Constructs a LinearMap for (A - σI)⁻¹ using Pardiso factorization.
"""
function shiftinvert_pardiso_operator(A::SparseMatrixCSC, σ;
                                      matrixtype=:COMPLEX_HERM_INDEF)

    n = size(A, 1)
    # Init Pardiso solver
    ps = MKLPardisoSolver()
    set_matrixtype!(ps, getfield(Pardiso, matrixtype))
    pardisoinit(ps)
    fix_iparm!(ps, :N)

    # Shifted matrix
    A_shifted = A - σ * I

    # Register and factorize
    A_pardiso = get_matrix(ps, A_shifted, :N)
    btmp = zeros(eltype(A), n)

    set_phase!(ps, Pardiso.ANALYSIS)
    pardiso(ps, A_pardiso, btmp)

    println("Estimated mem (KB): ", get_iparm(ps, 15))
    println("Permanent mem (KB): ", get_iparm(ps, 16))
    println("Peak mem (KB): ", get_iparm(ps, 17))
    flush(stdout)

    set_phase!(ps, Pardiso.NUM_FACT)
    pardiso(ps, A_pardiso, btmp)

    # Return LinearMap + solver handle
    linmap = LinearMap{eltype(A)}(
        ShiftAndInvertPardiso(ps, A_pardiso),
        n; ismutating=true
    )
    return linmap, ps
end

# Apply operator y = (A - σI)⁻¹ x
function (op::ShiftAndInvertPardiso)(y, x)

    set_phase!(op.ps, Pardiso.SOLVE_ITERATIVE_REFINE)
    pardiso(op.ps, y, op.A_pardiso, x)
end


# Start of Solver Routine
dE = 0.2
levels = 8

S = readdlm("matrix.txt")

A = sparse(Int.(S[:,1]), Int.(S[:,2]), complex.(S[:,3], S[:,4]))

# Pardiso Commands
T, ps = shiftinvert_pardiso_operator(A, dE; matrixtype=:COMPLEX_HERM_INDEF)
decomp, history = partialschur(T; which=:LM, nev=levels, tol=1e-8)
history.converged || error("The eigenvalue solver did not converge.")
μ, V = partialeigen(decomp)

# λ of the original problem are λ = 1 / μ + σ (here σ = dE)
D = 1 ./ μ .+ dE

# Cleanup
set_phase!(ps, Pardiso.RELEASE_ALL)
pardiso(ps)

Am I better off swapping to an iterative method rather than a direct method like this?

Yes, in my example I just used lu to give you the general idea; for larger matrices Pardiso achieves much sparser factorizations and can also exploit the Hermiticity of your problem (which a plain LU does not).

Keep in mind that you are already using an iterative method with ArnoldiMethod. But to compute the action of the shift-invert operator on a given vector, you factorize a matrix, and that factorization is the step that is expensive in memory.

Therefore, yes, you may need to avoid the factorization for very large problems, although passing the matrix directly to the iterative methods would yield much longer computation times, as you saw in the beginning, because without the factorization the action of the shifted inverse has to be computed with an inner iterative solve, costing many matrix-vector products per application.

I do not know other alternatives very well (I read a while ago that there are multigrid methods that work well for eigenvalues coming from an elliptic operator), so hopefully somebody else can point you to a better option for very large sparse arrays.

For the shift-invert approach using direct factorization, you might also see whether the ldlt factorization function in SparseArrays (which uses a CHOLMOD routine) succeeds for your matrices, possibly using less memory than Pardiso.
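
To make that concrete, here is a minimal, untested sketch of how ldlt could slot into the ShiftAndInvert pattern from earlier in the thread, reusing A, dE, and levels from your snippet above (the names ShiftAndInvertLDLt and shiftinvert_ldlt_operator are just for illustration; ldlt takes a shift keyword, so the shifted matrix never has to be formed explicitly):

using LinearAlgebra, SparseArrays, LinearMaps, ArnoldiMethod

struct ShiftAndInvertLDLt{TF}
    F::TF   # CHOLMOD LDLt factorization of A - σI
end

# Each application solves (A - σI) y = x via the stored factorization.
# (F \ x allocates a temporary vector; that is fine for a sketch.)
(M::ShiftAndInvertLDLt)(y, x) = copyto!(y, M.F \ x)

function shiftinvert_ldlt_operator(A, σ)
    n = size(A, 1)
    # shift = -σ factorizes A - σI without building it; σ must be real so the
    # shifted matrix stays Hermitian.
    F = ldlt(Hermitian(A); shift = -σ)
    return LinearMap{eltype(A)}(ShiftAndInvertLDLt(F), n; ismutating = true)
end

T = shiftinvert_ldlt_operator(A, dE)
decomp, history = partialschur(T; which = :LM, nev = levels, tol = 1e-8)
μ, V = partialeigen(decomp)
λ = 1 ./ μ .+ dE   # map back to eigenvalues of A near the shift dE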

With regard to the Jacobi-Davidson scheme, the partial Schur decomposition in the Hermitian case is the same as a partial diagonalization, so pschur.Q holds the approximate eigenvectors and pschur.values the corresponding eigenvalues.
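
So, from your original jdqr call, something along these lines should recover them (an untested sketch relying only on the field names just described):

λ = pschur.values        # approximate eigenvalues (real up to the tolerance in the Hermitian case)
X = Matrix(pschur.Q)     # approximate eigenvectors, one per column (Matrix in case Q is stored as a view)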

Note that the JacobiDavidson package does not currently take proper advantage of symmetry. It may still be effective in cases where direct factorization is impractical. For this approach, I suggest that you find a good preconditioner, which requires domain knowledge (you may be the only one here with experience in \vec{k}\cdot\vec{p} Hamiltonians). If the scheme looks promising but you need better performance from a Hermitian-tailored library, I have a wrapper for the PRIMME library, but be forewarned: the interface is very complicated.
