Possible bug in KrylovKit.eigsolve

Author of KrylovKit here. For a symmetric matrix, KrylovKit uses what is known as the (implicitly restarted) Lanczos method, a special case of the Arnoldi method that exploits the symmetric structure.

To answer your first questions: Julia, like any other scientific language, provides generic and stable methods to compute the eigenvalues of a symmetric 8 x 8 matrix; in fact these are provided via LAPACK. Such methods compute all eigenvalues (although there are also specialized methods to compute a range of eigenvalues, or specific eigenvalues, in the case of symmetric matrices). All of these are accessible via Julia's LinearAlgebra.eigen.
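For example (a minimal sketch with a random symmetric matrix):

```julia
using LinearAlgebra

A = Symmetric(rand(8, 8))   # the Symmetric wrapper dispatches to the
                            # specialized LAPACK routine for symmetric matrices
vals, vecs = eigen(A)       # all eight eigenvalues, in ascending order
```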

Once you have large problems for which you only want a few eigenvalues, you should know at least a bit about the kind of methods that you will use to solve the problem. Not to the extent that you know how to implement them, but at least such that you know the basic philosophy behind a method, and when you can and cannot use it. KrylovKit in particular uses methods based on Krylov subspaces: you start from a given initial vector v and generate the subspace spanned by v, A*v, A^2*v, A^3*v, and so forth. Within this subspace, an approximation to the eigenvalues and eigenvectors is sought. This works well for extremal eigenvalues: if you were to plot all the eigenvalues in the complex plane, Krylov methods (without shift-and-invert strategies) are generically good only for eigenvalues on the periphery of that set of points. For a hermitian / real symmetric matrix, the eigenvalues lie on the real line, and the extremal ones are the most positive and most negative eigenvalues.
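To illustrate, a minimal sketch of requesting a few extremal eigenvalues of a larger symmetric matrix (random here just for the sake of example):

```julia
using KrylovKit, LinearAlgebra

A = Symmetric(rand(100, 100))
x0 = rand(100)

# :SR / :LR select the smallest / largest real eigenvalues, i.e. the two
# extremal ends of the spectrum; since A is wrapped in Symmetric, eigsolve
# detects the symmetric structure and uses the Lanczos method
vals, vecs, info = eigsolve(A, x0, 2, :SR)
```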

What can go wrong, of course, is that v, when decomposed in the basis of eigenvectors, has no component along a certain eigenvector. In that case, no such component will ever be generated in the Krylov subspace, and so you can never recover that eigenvector from that subspace, even if it is, for example, the eigenvector with the largest-magnitude eigenvalue. This is mostly a theoretical concern, as floating-point rounding errors will typically generate small (order machine precision) components which then grow in the Krylov subspace. However, for your example, if you start with a vector v which only has nonzero components in its first four rows, you will never generate components in the last four rows.
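For illustration, here is a hypothetical block-diagonal stand-in for such a matrix (the values are made up): a starting vector supported on the first block can never leave it.

```julia
using KrylovKit, LinearAlgebra

# hypothetical 8 x 8 block-diagonal symmetric matrix: the two blocks never mix
B = rand(4, 4); B = (B + B') / 2
C = rand(4, 4); C = (C + C') / 2
A = [B zeros(4, 4); zeros(4, 4) C]

v = [rand(4); zeros(4)]     # no component in the last four rows

vals, vecs, info = eigsolve(A, v, 3, :LM)
# every returned eigenvector vanishes on rows 5:8; the eigenvalues of C are
# invisible to this Krylov subspace, even if the largest-magnitude one is there
```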

A related problem is degenerate eigenvalues. Suppose a certain eigenspace is two-dimensional. The starting vector has a certain component in that eigenspace, but that component selects only one direction in it. A second, linearly independent direction in that eigenspace can, in theory, never be generated in the Krylov subspace, and so in theory Krylov subspace methods cannot find degenerate eigenvalues. In practice, once the first eigenvector is well approximated and deflated, a second linearly independent component can again be generated from numerical noise, but it's kind of bad practice to rely on this behaviour.
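A small hypothetical example of this (the matrix is made up):

```julia
using KrylovKit, LinearAlgebra

# the eigenvalue 2 is doubly degenerate
A = Matrix(Diagonal([2.0, 2.0, 1.0, 0.5]))
v = rand(4)

vals, vecs, info = eigsolve(A, v, 4, :LM)
# the Krylov subspace contains a single direction inside the two-dimensional
# eigenspace of 2, so it becomes (numerically) invariant after three steps:
# at most three eigenvalues are found, and 2 shows up only once
```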

There is one aspect that sets KrylovKit apart from other packages such as Arpack. Packages like Arpack do not require that A is an explicit matrix (that would often defeat the purpose of using them), but they do assume that all the vectors take the standard format of a Vector. Once the Krylov subspace v, A*v, A^2*v, … , A^k * v becomes an invariant subspace, A^(k+1) * v is linearly dependent on the previous vectors, and so after orthogonalizing it with respect to them, the zero vector is obtained. Arpack would (could? I am not entirely sure it does this) then just continue with a new random vector, again orthogonalized against the already-found invariant subspace, but this time yielding a new nonzero, linearly independent direction.
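To make the invariant-subspace statement concrete, here is a small sketch, reusing the degenerate example above, in which the next Krylov vector orthogonalizes to (numerically) zero:

```julia
using LinearAlgebra

# span(v, A*v, A^2*v) is already invariant here (three distinct eigenvalues)
A = Matrix(Diagonal([2.0, 2.0, 1.0, 0.5]))
v = normalize(rand(4))

Q = Matrix(qr([v A*v A^2*v]).Q)   # orthonormal basis of the Krylov subspace
w = A^3 * v
w -= Q * (Q' * w)                 # orthogonalize against previous directions
norm(w)                           # essentially zero, up to rounding errors
```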

However, KrylovKit does not even assume that the vectors in the problem are standard Vector types. It does not assume that they have a finite length (and thus a dimension), nor does it rely on being able to fill those vectors with random numbers. This is useful in many cases. But as a consequence, once an invariant subspace is obtained and A^(k+1) * v has norm zero after orthogonalization, there is no way for KrylovKit to automatically generate a new random direction. And thus, it just stops and prints a nice warning.
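For instance, a minimal sketch of the function-based interface, with a hypothetical matrix-free operator applyA: KrylovKit only ever sees the action of the map, and learns the "vector" type from the starting vector alone, so it has no way to manufacture a fresh random vector when an invariant subspace is encountered.

```julia
using KrylovKit

# 1-D discrete Laplacian, applied matrix-free: the matrix is never stored
function applyA(x)
    y = similar(x)
    n = length(x)
    for i in 1:n
        y[i] = 2x[i] - (i > 1 ? x[i-1] : 0) - (i < n ? x[i+1] : 0)
    end
    return y
end

x0 = rand(1000)
# issymmetric must be declared by hand, since KrylovKit cannot inspect a function
vals, vecs, info = eigsolve(applyA, x0, 3, :SR; issymmetric = true)
```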
