You beat me to posting this. This sort of construction shows that if A is diagonalizable with a repeated eigenvalue, then every neighborhood of A includes matrices that are defective. Every neighborhood also includes matrices with distinct eigenvalues. If you use a stable algorithm, you at best get eigenvalues and eigenvectors of a nearby matrix. So variation in the number of eigenvectors found is consistent with a stable algorithm. With higher multiplicity eigenvalues, being diagonalizable doesn’t help all that much. This is really more like trying to determine the Jordan form of A in which you need to make decisions about Jordan block sizes of matrices close to A. That is known to be quite challenging and LAPACK isn’t even trying to do that.
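To make the "both kinds of nearby matrices" point concrete, here is a minimal made-up sketch around the 2×2 identity (the identity and the value of ϵ are just my choices for illustration, not anything from this thread):

```julia
using LinearAlgebra

# The 2x2 identity is diagonalizable with the repeated eigenvalue 1.
# Arbitrarily small perturbations can make it defective or give it
# distinct eigenvalues.
ϵ = 1e-12

A_defective = [1.0 ϵ; 0.0 1.0]        # a single 2x2 Jordan block: one eigenvector
A_distinct  = [1.0+ϵ 0.0; 0.0 1.0]    # two distinct eigenvalues

for B in (A_defective, A_distinct)
    F = eigen(B)
    println(F.values, "    cond of eigenvector matrix: ", cond(F.vectors))
end
```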
To further complicate things, you really don’t have a very strong standard of stability for algorithms for non-normal eigenvalue problems. What you can expect from LAPACK is eigenvalue/eigenvector pairs that achieve a small relative residual. That is, r_i = Ax_i - \lambda_i x_i will be such that
\frac{\|r_i\|}{\|A\| \|x_i\|}
is small. If you define
E_i = \frac{1}{\|x_i\|^2} r_i x_i^T
then (A - E_i) x_i - \lambda_i x_i = 0 with
\frac{\|E_i\|}{\|A\|} = \frac{\|r_i\|}{\|A\| \|x_i\|}
which is just the relative residual norm. (Note that I’m assuming the 2-norm for both vectors and matrices.) So a small relative residual gives you a small backward error: x_i and \lambda_i are an eigenvalue/eigenvector pair of a matrix close to A. But this is the backward error for a single pair. There is no guarantee that there is a single suitably small E such that you get all the eigenvalues and eigenvectors of A+E. If you try to construct it somewhat more generally from the residuals, you might choose
E = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix} X^{-1}
where X is the matrix of eigenvectors. You then have
(A-E)X = XD
where D is the diagonal matrix of eigenvalues. But the possibility that X might be ill conditioned (or even numerically singular) prevents you from concluding that E is small.
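Here is a small sketch of that bookkeeping in Julia; the nearly defective 2×2 matrix is my own made-up example (not the matrix from this thread), chosen so that X is ill conditioned:

```julia
using LinearAlgebra

ϵ = 1e-8
A = [1.0 1.0; ϵ^2 1.0]      # eigenvalues 1 ± ϵ with nearly parallel eigenvectors

F = eigen(A)
X = F.vectors
D = Diagonal(F.values)

R = A * X - X * D            # columns are the residuals r_i
for i in axes(A, 1)
    r = R[:, i]
    x = X[:, i]
    relres = norm(r) / (opnorm(A) * norm(x))        # per-pair relative residual
    Ei = (r * x') / norm(x)^2                       # rank-one backward error E_i
    backerr = norm((A - Ei) * x - F.values[i] * x)  # should be ~0
    println("pair $i: relative residual = $relres, residual for A - E_i: $backerr")
end

# One perturbation for all pairs at once: E = [r_1 ... r_n] X^{-1}.
# Its size can be inflated well above the individual relative residuals
# when X is ill conditioned.
E = R / X
println("cond(X) = ", cond(X), ",  opnorm(E)/opnorm(A) = ", opnorm(E) / opnorm(A))
```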
I think this is a nice example of some of the tricky things that happen when working with non-normal eigenvalue problems. It is quite interesting that Matlab gets different results. I don’t think it’s likely that it could be using any algorithm that gives strong guarantees, given all the difficulties involved.
Weird stuff can even happen with symmetric matrices if you use an algorithm that doesn’t try to preserve symmetry. Last fall, I had a student run into a similar issue using Python on a symmetric matrix. He had a symmetric matrix with repeated eigenvalues and used NumPy’s `eig` function (which uses the nonsymmetric algorithm) instead of `eigh`, which is specialized for symmetric problems. I don’t recall how bad it was, but the resulting eigenvector matrix was very clearly not orthogonal. I think it might even have been numerically singular, with a duplicated eigenvector as in this problem. It is perhaps worth noting that Julia’s `eigen` function runs `ishermitian(A)` so it can choose a more reliable method even if you don’t wrap `A` with `Hermitian`. Given how tricky this stuff can be, I think that’s a good call. My student was doing a class project that I had planned out by writing my own code in Julia, and I was surprised that he ran into trouble.