Accelerating repeated eigenvalue searches

Haven’t read the whole thread, but LOBPCG does this. There’s an implementation in IterativeSolvers.jl, and we also have one in DFTK.jl at src/eigen/lobpcg_hyper_impl.jl.


I’ll have a look! Perhaps unsurprisingly, my use case is also about computing bandstructures :wink:

@antoine-levitt I’ve had a chance to try IterativeSolvers.lobpcg!. Without reusing vectors, it is 2 to 3x slower than KrylovKit and ArnoldiMethod. With reusing the vectors from the previous k-point, it roughly halves the number of iterations needed to converge (~15 instead of ~30), making it comparable, but typically somewhat slower.

I thought that instead of just reusing vectors (feeding V_{n-1}, the matrix of eigenvectors of the previous k-point), extrapolating to the new k-point, i.e. feeding 2 V_{n-1} - V_{n-2} into lobpcg as the initial guess, would lead to better convergence. However, this actually hurt convergence and brought me back to the ~30 iterations of not reusing vectors at all. Any other ideas for accelerating that convergence?

Since KrylovKit.jl does not have block implementations, you can only pass one starting vector. You could pass a linear combination of the previous eigenvectors: if you did this without changing the matrix, the individual eigenvectors would be reconstructed from the linear combination in a number of matrix applications equal to the number of vectors involved.

2 V_{n-1} - V_{n-2} is not great, as it cannot accommodate band crossings and the like. If you want to use the information in V_{n-2}, a better solution is a Rayleigh-Ritz step (diagonalize the operator in the basis formed by the union of V_{n-1} and V_{n-2}) if you can afford it, although I’m not sure how many iterations it will buy you. Or even pass LOBPCG more vectors than you actually need, and set the stopping criterion to only look at the ones you do need (I don’t know about IterativeSolvers, but our solver has an option for this).
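A minimal sketch of that Rayleigh-Ritz step, in Python/NumPy for illustration rather than Julia (the function name is hypothetical, and `A` stands for a dense version of the operator at the new k-point): orthonormalize the union of the two previous eigenvector blocks, then diagonalize the projected operator.

```python
import numpy as np

def rayleigh_ritz_union(A, V1, V2, nev):
    # Orthonormal basis for span(V1) + span(V2); in practice one would guard
    # against near-linear dependence (e.g. drop small singular values).
    W, _ = np.linalg.qr(np.hstack([V1, V2]))
    # Project the operator into the subspace and diagonalize the small matrix.
    theta, Y = np.linalg.eigh(W.conj().T @ A @ W)
    # Ritz values and Ritz vectors for the nev lowest states.
    return theta[:nev], W @ Y[:, :nev]
```

The Ritz vectors `W @ Y[:, :nev]` are then a natural starting block for LOBPCG at the new k-point.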

This is called a numerical continuation problem, and in related problems (tracking nonlinear eigenvalues for lasing modes) we’ve had good results with BifurcationKit.jl (which can do higher-order extrapolation and handle certain kinds of singularities); I haven’t tried it for band structures, though.

(But you still have to be careful of band crossings, i.e. where you are computing N bands but the (N+1)-th band crosses the N-th, as @antoine-levitt discusses above.)


Presumably for band structures it’s a complex operator with an arbitrary phase on the eigenvectors, so if you want to extrapolate you’d at least need to align the vectors first, i.e. find the best unitary R such that V_{n-2} R is closest to V_{n-1} (which you get from an SVD of the overlap between V_{n-2} and V_{n-1}).

Note that for LOBPCG you only need to extrapolate the subspace, i.e. produce a set of vectors that nearly spans the new eigenvectors.

But for extrapolating from multiple k points I agree that you need to be careful about choosing the bases. One possibility is described in https://onlinelibrary.wiley.com/doi/abs/10.1002/pssb.202000260

Also https://doi.org/10.1021/acs.jctc.0c00684

@antoine-levitt @stevengj Fair, I did not consider that the phase might be an issue. For the low-order extrapolation I was proposing, would simply fixing the phase of each basis vector, i.e. enforcing that Vn[1,:] is real, solve most of the problem? (Admittedly, this does not deal with arbitrary superpositions of degenerate eigenvectors. I am dealing with a low-symmetry case, so I expect at most doublets. Perhaps @stevengj’s comment on needing only to extrapolate the subspace applies.) Otherwise, could you be a bit more explicit about the SVD approach you outlined?

Probably, as long as the first component doesn’t vanish. A better way is to align them, i.e. find the phase phi such that v_{n-1} phi is closest to v_{n-2}. This comes out as v_{n-1} becoming v_{n-1} <v_{n-1}, v_{n-2}>/|<v_{n-1}, v_{n-2}>| (or something like this).
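A minimal sketch of that single-vector phase alignment, in Python/NumPy for illustration (the function name is hypothetical; the inner-product convention is NumPy’s, <a, b> = aᴴ b):

```python
import numpy as np

def align_phase(v, v_prev):
    """Rotate v by the global phase phi (|phi| = 1) minimizing ||v*phi - v_prev||,
    which works out to phi = <v, v_prev> / |<v, v_prev>|."""
    ov = np.vdot(v, v_prev)  # <v, v_prev> = sum(conj(v) * v_prev)
    return v * (ov / abs(ov))
```

If v is v_prev up to a global phase, this recovers v_prev exactly; for nearly parallel vectors it removes the arbitrary overall phase before extrapolating.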

In the general multivector case, phi becomes a unitary matrix, which can be obtained from an SVD of the overlap matrix between V_{n-1} and V_{n-2}. I’m sure it’s very standard, but I don’t know offhand of a place where it’s written down properly…
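In code, the multivector case is the unitary Procrustes problem; a sketch in Python/NumPy for illustration (function name hypothetical): from the SVD of the overlap M = Vᴴ V_prev = U Σ Wᴴ, the best unitary is R = U Wᴴ.

```python
import numpy as np

def align_subspace(V, V_prev):
    """Rotate the columns of V by the unitary R minimizing ||V R - V_prev||_F
    (unitary Procrustes): M = V^H V_prev = U S W^H  =>  R = U W^H."""
    M = V.conj().T @ V_prev
    U, _, Wh = np.linalg.svd(M)
    return V @ (U @ Wh)
```

This also handles arbitrary mixing within degenerate pairs, since R can rotate inside the subspace, not just rephase individual columns.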

Would this be it?

Yep!