Numerically stable computation of matrix power

I admit the title is not great, but I couldn’t find a better one.

I have a matrix M whose eigenvalues lie in the closed unit disk, and the only eigenvalue on the unit circle is 1. Moreover, the algebraic multiplicity of the eigenvalue 1 equals its geometric multiplicity, i.e. there is no Jordan block associated with it.

Let M=PJP^{-1} be the Jordan decomposition of M, where J is M's Jordan form. Under the assumptions above, the limit J_\infty = \lim_{n\to\infty}J^n is well defined. Finally, I define M_\infty=PJ_\infty P^{-1}.

I want to compute M_\infty numerically. The problem is that numerically computing the Jordan decomposition is not stable. Is there any numerically stable way to compute M_\infty?

If you know this property of the spectrum analytically, then you just need to find a basis Q for the nullspace of M-I, in which case M^\infty = QQ^* is simply a projection onto this subspace. The function Q = nullspace(M-I) in the LinearAlgebra standard library will do this for you using the SVD. Correction: a non-orthogonal projection is needed, as described below.

A problem still arises, of course, if your matrix has eigenvalues extremely close to 1, so close that nullspace(M-I) can’t distinguish them from 1 to machine precision.

I think this is the solution. I suspect that I don’t have a problem with eigenvalues that are arbitrarily close to 1.

Ah, I think that QQ^* is not M_\infty but J_\infty.

No. J_\infty is diagonal.

I have an example where this does not work. I wrote it in Julia because I’m not sure how to write matrices in LaTeX here.

julia> using LinearAlgebra

julia> M = [1 0 1 0; 0 1 0 1/2; 0 0 0 1/2; 0 0 0 0]
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.0
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.5
 0.0  0.0  0.0  0.0

julia> A = M - I
4×4 Array{Float64,2}:
 0.0  0.0   1.0   0.0
 0.0  0.0   0.0   0.5
 0.0  0.0  -1.0   0.5
 0.0  0.0   0.0  -1.0

For this specific matrix, M^n = M^2 for any n > 1:

julia> M*M
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.5
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> M*M*M
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.5
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

The nullspace of A is straightforward:

julia> Q = nullspace(A)
4×2 Array{Float64,2}:
  0.0  1.0
 -1.0  0.0
  0.0  0.0
  0.0  0.0

However, QQ^* is not M^2:

julia> Q*transpose(Q)
4×4 Array{Float64,2}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Sorry, you’re quite right — the basic problem is that the nullspace of M-I is not orthogonal to the other Jordan vectors of M, so the correct projection is non-orthogonal.

I think what you need, instead, is to project using left and right nullspaces. I came up with the following a bit hastily, but it looks right to me and works for your example:

function Minf(M)
    R = nullspace(M - I)     # basis for the right nullspace of M - I
    L = nullspace(M' - I)'   # basis for the left nullspace, as rows
    return R / (L * R) * L   # oblique projection R (L R)^{-1} L
end

You might want to check it a bit more carefully.
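
As a first check, it reproduces the limit on your example above:

julia> Minf(M) ≈ M*M
true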

I think this is the solution. I will check it more, but in simple cases it works.

Note that my Minf implementation is suboptimal (if you care about factors of 2), since really you can get the left and right nullspaces simultaneously from the left and right singular vectors of the SVD. Just look at the source code for nullspace and pull out the relevant bits.
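
For instance, a sketch along these lines (Minf_svd is a hypothetical name, and the tolerance is a rough stand-in for the default used by nullspace): the right nullspace of M - I is spanned by the right singular vectors belonging to the numerically zero singular values, and the left nullspace by the corresponding left singular vectors, so one full SVD suffices:

function Minf_svd(M)
    F = svd(M - I; full = true)    # one SVD yields both nullspaces
    tol = size(M, 1) * eps(real(float(eltype(M)))) * F.S[1]
    r = count(>(tol), F.S)         # numerical rank of M - I
    R = F.V[:, r+1:end]            # right nullspace basis
    L = F.U[:, r+1:end]'           # left nullspace basis, as rows
    return R / (L * R) * L         # same oblique projection as in Minf
end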

Could you please explain briefly how you came up with this? Why does it work? I cannot figure it out. :confused:

Let us analyze the case where M is diagonalizable, since the defective case of a non-diagonalizable matrix (Jordan blocks) can be obtained at the end as the limit of diagonalizable matrices.

The diagonalization is M = R\Lambda R^{-1} where R is a matrix whose columns are (right) eigenvectors. Then M^T = (R^{-1})^T \Lambda R^T, but this is also a diagonalization and so we see by inspection that (R^{-1})^T = L, a matrix whose columns are left eigenvectors. (This is one proof of the general property that we can choose L^T R = I, i.e. left and right eigenvectors can be chosen orthonormal under the “unconjugated inner product” x^T y.)

That is, M = R \Lambda L^T with L^T R = I, and it follows that M^n = R \Lambda^n L^T.
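
A quick numerical illustration (a made-up 2×2 example with eigenvalues 1 and 0.5, not from this thread):

using LinearAlgebra
A = [1.0 0.25; 0.0 0.5]                 # eigenvalues 1 and 0.5
F = eigen(A)
R = F.vectors                           # columns are right eigenvectors
L = transpose(inv(R))                   # columns are left eigenvectors
@assert transpose(L) * R ≈ Matrix(I, 2, 2)              # L^T R = I
@assert R * Diagonal(F.values)^3 * transpose(L) ≈ A^3   # M^n = R Λ^n L^T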

In your case, all of the eigenvalues have |\lambda|<1 except for the eigenvalues with \lambda=1, so the limit \Lambda^\infty is a diagonal matrix that projects onto the rows/columns corresponding to the \lambda=1 eigenvalues. Hence M^\infty = R \Lambda^\infty L^T = R_0 L_0^T, where the columns of L_0 and R_0 are the left and right eigenvectors for \lambda=1; they span the same subspaces as any left and right nullspace bases \hat{L}_0 and \hat{R}_0 of M-I. Technically, the left nullspace is the nullspace of M^* - I, but this conjugation is cancelled if we also use M^\infty = \hat{R}_0 \hat{L}_0^* instead of the transpose.

However, we have to be careful, especially in the case where \lambda=1 has multiplicity > 1, since there are multiple possible choices of the left and right eigenvectors/nullspaces and we have to choose them consistently to ensure that \hat{L}_0^* \hat{R}_0 = I. If we choose them arbitrarily, we might in general need some change of basis \hat{R}_0 \to \hat{R}_0 B, giving M^\infty = \hat{R}_0 B \hat{L}_0^* for some (invertible) B. We can easily determine B by the fact that M^\infty \hat{R}_0 = \hat{R}_0, which immediately gives B = (\hat{L}_0^* \hat{R}_0)^{-1}.

So, in the end, for any diagonalizable matrix M with the spectrum you assumed, we have M^\infty = \hat{R}_0 (\hat{L}_0^* \hat{R}_0)^{-1} \hat{L}_0^*, in terms of left and right nullspace bases \hat{L}_0 and \hat{R}_0 of M-I (regardless of how they are obtained/normalized).
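
As a sanity check, this formula behaves as the limit must: since M\hat{R}_0 = \hat{R}_0, we get M M^\infty = (M\hat{R}_0)(\hat{L}_0^* \hat{R}_0)^{-1}\hat{L}_0^* = M^\infty, and it is idempotent, since (M^\infty)^2 = \hat{R}_0 (\hat{L}_0^* \hat{R}_0)^{-1}(\hat{L}_0^* \hat{R}_0)(\hat{L}_0^* \hat{R}_0)^{-1}\hat{L}_0^* = M^\infty.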

For a defective matrix M, as long as the \lambda=1 eigenvalue is not itself defective, we can take M as the limit of diagonalizable matrices (which is always possible, since diagonalizable matrices are a dense subset of all square matrices), and we get the same answer: the limits of the \lambda=1 eigenvectors still span the nullspaces of M-I.

PS. I guess you get this spectrum from some kind of Markov property?

Thank you! That was very clear. Yes, it is the matrix of a random walk on a weakly connected graph where most of the vertices are transient.
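
For concreteness, here is a made-up toy chain of that kind (states 1 and 2 absorbing, states 3 and 4 transient; the numbers are mine), on which Minf recovers the absorption probabilities:

P = [1.0  0.0  0.0  0.0;
     0.0  1.0  0.0  0.0;
     0.5  0.0  0.0  0.5;
     0.0  0.5  0.5  0.0]

Minf(P)   # rows 3 and 4 converge to (2/3, 1/3, 0, 0) and (1/3, 2/3, 0, 0)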

IIUC this would also work:

function Minf2(M, tol = 10 * size(M, 1) * eps(real(float(eltype(M)))))
    S = schur(M)
    select = abs.(S.values .- 1) .< tol   # flag eigenvalues within tol of 1
    S1 = ordschur(S, select)              # reorder them to the leading block
    n1 = count(select)
    Z1 = S1.Z[:, 1:n1]                    # orthonormal basis for their invariant subspace
    return Z1 * Z1'
end

You might want to verify that S1.T[1:n1,1:n1] really is close to the identity matrix, as a check on construction of M and choice of tolerance.

There are also interesting possibilities for iterative methods in case your graphs are large.
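
For example (a rough sketch only; minf_apply is a hypothetical name and the stopping rule is a placeholder): since M^n x \to M^\infty x, you can compute the action of M^\infty on a vector by repeated sparse matvecs, without forming M^\infty densely. The convergence rate is set by the largest |\lambda| < 1.

function minf_apply(M, x; maxiter = 10_000, tol = 1e-12)
    for _ in 1:maxiter
        y = M * x
        norm(y - x) <= tol * norm(x) && return y   # fixed point of M reached
        x = y
    end
    return x
end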

Nope, Minf2 doesn’t work. Try it on @gmout’s example above.
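
Indeed, with the M from that post:

julia> Minf2(M) ≈ M*M
false

It returns the orthogonal projection onto the \lambda=1 invariant subspace, which is essentially QQ^* again.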

You’re right, of course. I left out a “left” in my thinking; I’d have to do a second Schur decomposition and a linear solve like yours, so the suggested SVD approach is more efficient and just as stable AFAICT.