Eigen fails to find eigenvectors

I am confused. I’m computing the eigenvalues and eigenvectors of a 3×3 matrix, but it seems like eigen does not find all of the eigenvectors correctly.

julia> using LinearAlgebra

julia> A = [1 1 1; 1 1 1; -1 -1 -1]
3×3 Matrix{Int64}:
  1   1   1
  1   1   1
 -1  -1  -1

julia> D, Q = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 0.0
 9.860761315262648e-32
 0.9999999999999996
vectors:
3×3 Matrix{Float64}:
 -2.21943e-17  -3.20494e-16   0.57735
  0.707107     -0.707107      0.57735
 -0.707107      0.707107     -0.57735

The first two columns of Q are linearly dependent and rank(Q) == 2. I get the same result with MKL. However, A is diagonalizable, and Maple finds the linearly independent vectors

julia> Q = [1 1 1; 0 -1 1; -1 0 -1]
3×3 Matrix{Int64}:
  1   1   1
  0  -1   1
 -1   0  -1
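
A quick check of both claims (just a sketch; Qnum here is the eigenvector matrix returned by eigen above, and Qm is the Maple matrix with eigenvalue ordering 0, 0, 1):

using LinearAlgebra

A = [1 1 1; 1 1 1; -1 -1 -1]

# eigenvector matrix from eigen(A) above; its first two columns are numerically parallel
Qnum = eigen(A).vectors
rank(Qnum)                             # reported as 2 above

# eigenvector matrix from Maple, with eigenvalues 0, 0, 1 on the diagonal
Qm = [1 1 1; 0 -1 1; -1 0 -1]
A * Qm == Qm * Diagonal([0, 0, 1])     # true, and Qm is invertible, so A is diagonalizable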

I’m confused because this is a tiny, real, diagonalizable matrix that can even be solved in closed form, yet eigen still fails. Am I doing something wrong here? Or are my expectations of eigen incorrect?

numpy.linalg.eig gives the same problematic result:

import numpy as np
A = np.asarray([[1, 1, 1], [1, 1, 1], [-1, -1, -1]])
np.linalg.eig(A)

yields

EigResult(eigenvalues=array([1.00000000e+00, 9.86076132e-32, 0.00000000e+00]), eigenvectors=array([[ 5.77350269e-01, -3.20493781e-16, -2.21943272e-17],
       [ 5.77350269e-01, -7.07106781e-01,  7.07106781e-01],
       [-5.77350269e-01,  7.07106781e-01, -7.07106781e-01]]))

I’m not sure off the top of my head what is going on.

In Matlab:

>> A = [1 1 1; 1 1 1; -1 -1 -1]
A =
     1     1     1
     1     1     1
    -1    -1    -1
>> [V,D] = eig(A)
V =
    0.5774   -0.0000   -0.2955
    0.5774   -0.7071    0.8069
   -0.5774    0.7071   -0.5114
D =
    1.0000         0         0
         0    0.0000         0
         0         0   -0.0000

Weird, Matlab R2024b works for me too. But GNU Octave has the same problem:

octave:1> A = [1 1 1; 1 1 1; -1 -1 -1]
A =

   1   1   1
   1   1   1
  -1  -1  -1

octave:2> [V,D] = eig(A)
V =

   5.7735e-01  -3.2049e-16  -2.2194e-17
   5.7735e-01  -7.0711e-01   7.0711e-01
  -5.7735e-01   7.0711e-01  -7.0711e-01

D =

Diagonal Matrix

   1.0000        0        0
        0   0.0000        0
        0        0        0

Note that all of these languages (Julia, Octave, Matlab, Numpy) should be calling the same underlying LAPACK routine for the eigenvalues and eigenvectors. So it could be that there is a bug in some versions of LAPACK on some platforms? It is surprising that Matlab seems okay.
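
On the Julia side, one way to see which BLAS/LAPACK backend is actually in use (assuming a recent Julia with libblastrampoline):

using LinearAlgebra

# Shows the BLAS/LAPACK library Julia currently forwards to
# (OpenBLAS by default; MKL.jl or AppleAccelerate.jl replace it when loaded).
BLAS.get_config()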

On my laptop, OpenBLAS succeeds while Apple Accelerate fails:

julia> D, Q = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 -1.5391940822381474e-16
  1.8488927466117464e-31
  0.9999999999999999
vectors:
3×3 Matrix{Float64}:
 -0.553394   0.0       -0.57735
  0.796616  -0.707107  -0.57735
 -0.243222   0.707107   0.57735

julia> using AppleAccelerate

julia> D, Q = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 -2.2058366349782935e-24
  2.2058367749154623e-24
  0.9999999999999999
vectors:
3×3 Matrix{Float64}:
 -6.84244e-9  -6.84244e-9  -0.57735
  0.707107    -0.707107    -0.57735
 -0.707107     0.707107     0.57735
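
One way to make "succeeds" versus "fails" concrete is to look at the rank of the returned eigenvector matrix and the factorization residual (a sketch, using the same A as above):

using LinearAlgebra

A = [1 1 1; 1 1 1; -1 -1 -1]
D, Q = eigen(A)
rank(Q)                          # 3 when the solver finds a full basis, 2 when it does not
opnorm(A * Q - Q * Diagonal(D))  # tiny either way: each returned pair has a small residual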

It is not diagonalizable: since rank(Q) = 2, you can’t write A = QDQ^{-1}, because Q^{-1} does not exist.

Furthermore, you have two null eigenvalues. In that case I would not expect the eigenvector matrix to have any particular meaningful value, since any linear combination of the two null eigenvectors is also a null eigenvector; in particular, the difference of the first two columns of the Maple Q is, after normalization, approximately the first two columns of the Julia Q (up to sign).

Indeed, on MacBook Air M2:

julia> A = [1 1 1; 1 1 1; -1 -1 -1]
3×3 Matrix{Int64}:
  1   1   1
  1   1   1
 -1  -1  -1

julia> using LinearAlgebra

julia> D, Q = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 -1.5391940822381474e-16
  1.8488927466117464e-31
  0.9999999999999999
vectors:
3×3 Matrix{Float64}:
 -0.553394   0.0       -0.57735
  0.796616  -0.707107  -0.57735
 -0.243222   0.707107   0.57735

julia> using AppleAccelerate

julia> D, Q = eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 -2.2058366349782935e-24
  2.2058367749154623e-24
  0.9999999999999999
vectors:
3×3 Matrix{Float64}:
 -6.84244e-9  -6.84244e-9  -0.57735
  0.707107    -0.707107    -0.57735
 -0.707107     0.707107     0.57735

The matrix is diagonalizable; the problem is that the solver returns a Q with rank 2. The zero eigenvalue is associated with a two-dimensional eigenspace, so the job of an eigensolver is to find two linearly independent vectors that form a basis for that eigenspace. Maple, Matlab, and in some cases OpenBLAS succeed at this, but in the other cases reported here the solver fails.
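
For the zero eigenvalue specifically, the SVD-based nullspace gives such a basis reliably (a sketch; this is not what the nonsymmetric eigensolver does internally):

using LinearAlgebra

A = [1 1 1; 1 1 1; -1 -1 -1]
N = nullspace(Float64.(A))   # 3×2 orthonormal basis of the λ = 0 eigenspace
opnorm(A * N)                # ≈ 0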


Sorry, I misunderstood your post and thought, without checking, that the real Q has rank 2.

Probably the problem is numerically unstable, since any small change in the components of the matrix A changes its rank to 2 or even 3, and this produces very different eigenvectors:

julia> A2 = [1 1 1+2*eps(); 1 1 1; -1 -1 -1]
3×3 Matrix{Float64}:
  1.0   1.0   1.0
  1.0   1.0   1.0
 -1.0  -1.0  -1.0

julia> eigen(A2)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
3-element Vector{Float64}:
 0.0
 3.140184917367554e-16
 0.9999999999999992
vectors:
3×3 Matrix{Float64}:
 -0.399385  -5.76889e-16   0.57735
  0.816433  -0.707107      0.57735
 -0.417048   0.707107     -0.57735

julia> A3 = [1 1 1+2*eps(); 1 1+2*eps() 1; -1 -1 -1]
3×3 Matrix{Float64}:
  1.0   1.0   1.0
  1.0   1.0   1.0
 -1.0  -1.0  -1.0

julia> eigen(A3)
Eigen{ComplexF64, ComplexF64, Matrix{ComplexF64}, Vector{ComplexF64}}
values:
3-element Vector{ComplexF64}:
 2.612969163921257e-16 - 6.402950661486973e-16im
 2.612969163921257e-16 + 6.402950661486973e-16im
    0.9999999999999998 + 0.0im
vectors:
3×3 Matrix{ComplexF64}:
  0.233278-0.357074im   0.233278+0.357074im   0.57735+0.0im
  0.459281+0.357074im   0.459281-0.357074im   0.57735+0.0im
 -0.692559-0.0im       -0.692559+0.0im       -0.57735+0.0im 

It’s not obvious to me that the problem is ill-conditioned (i.e. very sensitive to small perturbations … “unstable” is technically an adjective that applies to algorithms). There is a repeated eigenvalue, but it’s not defective (or close to defective, I think?), so the eigenvalue should be(?) well conditioned. Of course, an arbitrary small perturbation will split the degeneracy and you’ll get different eigenvectors depending on the perturbation, but the subspace spanned by the two eigenvectors of \lambda = 0 should change smoothly with perturbations.

So, one should expect to get two independent eigenvectors for the zero (or near-zero) eigenvalue. Precisely which two eigenvectors you get could depend sensitively on the algorithm, but they should span the same space. And you shouldn’t get a rank-deficient eigvecs(A).

Let’s put it another way: is there a defective matrix very close to A?

It is not obvious to me either that it would be numerically unstable. All entries of A, D, and Q are in the range [-1, 1], which should be ideal territory for floating-point precision. However, it fails with OpenBLAS, MKL, and MATLAB R2022b. Furthermore, after a simple permutation of the rows, or after casting to Float32, it works:

julia> A = [-1 -1 -1; 1 1 1; 1 1 1];
julia> D, Q = eigen(A);
julia> Q ./ Q[end,:]'
3×3 Matrix{Float64}:
  0.0  -2.0  -1.0
 -1.0   1.0   1.0
  1.0   1.0   1.0
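
The Float32 half of that claim would be checked the same way (a sketch; as reported above, whether it "works" evidently depends on backend and version):

using LinearAlgebra

A32 = Float32.([1 1 1; 1 1 1; -1 -1 -1])
D32, Q32 = eigen(A32)
rank(Q32)   # reported above to be 3, i.e. a full set of independent eigenvectors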

It would be good to check this explicitly for this matrix: a method is given in Sun (1992).

Update: I just checked this: according to their theorem 2.3, the condition number is opnorm(X1 * Y1') where X1 is a matrix of right-eigenvectors for the multiple eigenvalue and Y1 are the corresponding left eigenvectors. Here, X1 = X[:, 1:2] where X = [1 1 1; 0 -1 1; -1 0 -1] are the eigenvectors given in the first post above, and Y1 = Y[:, 1:2] where Y = [-1 1 1; -1 0 1; -2 1 1] = inv(X)' are the corresponding left eigenvectors. opnorm(X[:, 1:2] * Y[:, 1:2]') then gives 3.0 — with a condition number of 3, I would say that this eigenvalue is well-conditioned (i.e. it changes slowly with respect to small changes in the matrix).
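
Spelled out in code (the column ordering follows the Q from the first post, where columns 1 and 2 belong to the zero eigenvalue):

using LinearAlgebra

X  = [1 1 1; 0 -1 1; -1 0 -1]   # right eigenvectors; columns 1:2 span the λ = 0 eigenspace
Y  = inv(X)'                    # corresponding left eigenvectors
X1, Y1 = X[:, 1:2], Y[:, 1:2]
opnorm(X1 * Y1')                # ≈ 3.0: the condition number from Theorem 2.3 of Sun (1992)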


Well, that’s quite interesting. I tried ArnoldiMethod, an alternative to the iterative QR method, and I think I got linearly independent results.

julia> using ArnoldiMethod, LinearAlgebra # for det

julia> A = [1 1 1; 1 1 1; -1 -1 -1]

julia> decomp, history = partialschur(A)
(ArnoldiMethod.PartialSchur{SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}, SubArray{Float64, 2, Matrix{Float64}, Tuple{UnitRange{Int64}, UnitRange{Int64}}, false}, ComplexF64}([0.577350269189626 -0.42529835755919954 0.6969849164251074; 0.5773502691896258 -0.3909574648991169 -0.7168116400466181; -0.5773502691896258 -0.8162558224583167 -0.01982672362151074], [1.0000000000000004 -2.8275931129434504 -0.0686817853201652; 0.0 -1.5839341551313139e-16 -6.938893903907228e-18; 0.0 0.0 -2.0993342935325818e-18], ComplexF64[1.0000000000000004 + 0.0im, -1.5839341551313139e-16 + 0.0im, -2.0993342935325818e-18 + 0.0im]), Converged: 3 of 3 eigenvalues in 3 matrix-vector products)

julia> decomp.R # the schur R is not diagonal
3×3 view(::Matrix{Float64}, 1:3, 1:3) with eltype Float64:
 1.0  -2.82759      -0.0686818
 0.0  -1.58393e-16  -6.93889e-18
 0.0   0.0          -2.09933e-18

julia> eigenvalues, eigenvectors = partialeigen(decomp)
([-1.5839341551313139e-16, -2.0993342935325818e-18, 1.0000000000000004], [0.40250992710306155 0.6812723048853335 0.577350269189626; 0.4139598923627391 -0.7303809855670701 0.5773502691896258; -0.816469819465801 0.04910868068173664 -0.5773502691896258])

julia> det(eigenvectors) # linear independence
-0.33255664599504337

julia> eigenvalues
3-element Vector{Float64}:
 -1.5839341551313139e-16
 -2.0993342935325818e-18
  1.0000000000000004

julia> eigenvectors
3×3 Matrix{Float64}:
  0.2247    -0.590232   0.57735
  0.567453  -0.193473   0.57735
 -0.792153   0.783705  -0.57735
julia> A * eigenvectors # still a little bit weird: the first column seems to align with the third
3×3 Matrix{Float64}:
  1.11022e-16  0.0   0.57735
  1.11022e-16  0.0   0.57735
 -1.11022e-16  0.0  -0.57735

Maybe there is a vulnerability in the iterative QR method when there is an eigenvalue-0 subspace? This is a case where I would naively expect Lanczos-like methods to work poorly, since they depend on the magnitude of the eigenvalues.

BTW, thanks for posting this; it reminds me of a weird non-invertible case in my own code…

Maybe

Q\begin{bmatrix} 0 & \epsilon & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} Q^{-1}

It is arbitrarily close to A, has a null eigenvalue of multiplicity 2, but the subspace associated with the null eigenvalue has dimension 1.
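
For concreteness, here is that construction in code, using the Q from the first post and an arbitrary small ε (a sketch):

using LinearAlgebra

A = [1 1 1; 1 1 1; -1 -1 -1]
Q = [1 1 1; 0 -1 1; -1 0 -1]
ε = 1e-12
B = Q * [0 ε 0; 0 0 0; 0 0 1] * inv(Q)   # defective: λ = 0 has multiplicity 2 but only one eigenvector
opnorm(B - A)                             # on the order of ε, so B is arbitrarily close to A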


You beat me to posting this. This sort of construction shows that if A is diagonalizable with a repeated eigenvalue, then every neighborhood of A includes matrices that are defective. Every neighborhood also includes matrices with distinct eigenvalues. If you use a stable algorithm, you at best get eigenvalues and eigenvectors of a nearby matrix. So variation in the number of eigenvectors found is consistent with a stable algorithm. With higher multiplicity eigenvalues, being diagonalizable doesn’t help all that much. This is really more like trying to determine the Jordan form of A in which you need to make decisions about Jordan block sizes of matrices close to A. That is known to be quite challenging and LAPACK isn’t even trying to do that.

To further complicate things, you really don’t have a very strong standard of stability for algorithms for non-normal eigenvalue problems. What you can expect from LAPACK is eigenvalue/eigenvector pairs that achieve a small relative residual. That is r_i = Ax_i - \lambda_i x_i will be such that

\frac{\|r_i\|}{\|A\| \|x_i\|}

is small. If you define

E_i = \frac{1}{\|x_i\|^2} r_i x_i^T

then (A-E_i) x_i - \lambda_i x_i = 0 with

\frac{\|E_i\|}{\|A\|} = \frac{\|r_i\|}{\|A\| \|x_i\|}

which is just the relative residual norm. (Note I’m assuming vector/matrix 2-norm). So a small relative residual gives you a small backward error: x_i and \lambda_i are the eigenvector/value of a matrix close to A. But this is the backward error for a single eigenvector/value. There is no guarantee that there is a single suitably small E such that you get all the eigenvalues and eigenvectors of A+E. If you try to construct it somewhat more generally from the residuals, you might choose

E = \begin{bmatrix} r_1 & r_2 & \cdots & r_n \end{bmatrix} X^{-1}

where X is the matrix of eigenvectors. You then have
(A-E)X = XD where D is the diagonal matrix of eigenvalues. But the possibility that X might be ill conditioned (or even numerically singular) prevents you from concluding that E is small.
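
A sketch of checking this numerically for this A: the per-pair relative residuals are tiny, but nothing forces the combined E to be small, because X is nearly singular:

using LinearAlgebra

A = [1 1 1; 1 1 1; -1 -1 -1]
λ, X = eigen(A)
R = A * X - X * Diagonal(λ)                                 # columns are the residuals r_i
[norm(R[:, i]) / (opnorm(A) * norm(X[:, i])) for i in 1:3]  # per-pair relative residuals: all tiny
E = R * inv(X)                  # the combined perturbation with (A - E) X = X D
opnorm(E) / opnorm(A)           # no guarantee this is small: inv(X) blows up when X is nearly singular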

I think this is a nice example of some of the tricky things that happen when working with non-normal eigenvalue problems. It is quite interesting that Matlab gets different results. I don’t think it’s likely that it could be using any algorithm that gives strong guarantees, given all the difficulties involved.

Weird stuff can even happen with symmetric matrices if you use an algorithm that doesn’t try to preserve symmetry. Last fall, I had a student run into a similar issue using Python on a symmetric matrix. He had a symmetric matrix with repeated eigenvalues and used Numpy’s eig function (which uses the nonsymmetric algorithm) instead of the function eigh, which is specialized for symmetric problems. I don’t recall how bad it was, but the resulting eigenvector matrix was very clearly not orthogonal. I think it might even have been numerically singular with a duplicated eigenvector as in this problem. It is perhaps worth noting that Julia’s eigen function runs ishermitian(A) so it can choose a more reliable method even if you don’t wrap A with Hermitian. Given how tricky this stuff can be, I think that’s a good call. My student was doing a class project where I had planned the project by writing my own code in Julia and I was surprised that he had troubles.
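
A small illustration of why that check helps, using an arbitrary symmetric matrix S with a repeated eigenvalue (a sketch):

using LinearAlgebra

S = [2.0 1.0 1.0; 1.0 2.0 1.0; 1.0 1.0 2.0]   # symmetric, eigenvalues 1, 1, 4
F = eigen(S)                                  # eigen sees ishermitian(S) and takes the symmetric path
opnorm(F.vectors' * F.vectors - I)            # ≈ 0: orthonormal vectors even for the repeated eigenvalue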


Thanks for your thorough explanation. Just to confirm: why would Hermitian algorithms be more reliable on this non-normal eigenvalue problem?

The Hermitian algorithms aren’t really applicable to your problem. I didn’t mean to suggest they would apply here. In floating point, with rounding, the most you can expect is to get eigenvalues and eigenvectors of a nearby matrix. Since for this A, every neighborhood of A contains matrices for which 0 is an eigenvalue with a 2\times 2 Jordan block and one eigenvector, as soon as rounding is involved, you lose information about what the Jordan structure of the original matrix A really was.

This is inherent in trying to numerically compute the Jordan form and is one reason that it is usually best to avoid trying to compute it. So a more focused answer to your question “Or are my expectations of eigen incorrect?” would be: yes. What you were expecting is equivalent to determining Jordan block sizes with rounding, which is not a well-posed problem: Jordan block sizes do not depend continuously on the elements of the matrix. You are expecting too much from a numerical algorithm working with rounded floating-point arithmetic.

Possibly Maple does something nice and exact with an integer matrix. And I think Matlab probably just did something different in a minor way and the rounding played out in a better way by chance. But it’s much harder to figure out what Matlab is doing relative to Julia.

And with that said, there are algorithms that try a bit harder to find nearby Jordan structures. But they are still not entirely reliable. I don’t know if anything is implemented in Julia. I googled a little and didn’t see anything. The original reference on this is by Golub and Wilkinson if you want to see more. But in general, the standard advice is to avoid these sorts of computations if you can.


Thanks for all the responses. I am still a bit surprised by the result, but now I understand better why it happens.

Interestingly, while A is not symmetric here, it can be factored and expressed as a symmetric generalized eigenvalue problem

eigen(ones(3,3), Diagonal([1,1,-1]))

but it yields the same result. Maybe it uses the same method under the hood.
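
For reference, the factorization behind that call (a sketch; B and D here are just the two factors):

using LinearAlgebra

B = ones(3, 3)
D = Diagonal([1, 1, -1])
D * B == [1 1 1; 1 1 1; -1 -1 -1]   # true: A = D * B, and D is its own inverse
# so A*x = λ*x  ⇔  B*x = λ*D*x, which is the generalized problem eigen(B, D) above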

The Diagonal right-hand-side matrix here is indefinite, so the good Hermitian-eigenproblem algorithms don’t apply.


I’m confused. @stevengj concludes that the eigenproblem for the \lambda = 0 eigenvalue of this matrix is fairly well-conditioned. @mstewart explains that the eigenproblem for a repeated eigenvalue of a non-normal matrix is not even well posed. How can I reconcile this?