Numerically stable computation of matrix power

I admit the title is not great, but I couldn’t find a better one.

I have a matrix M whose eigenvalues lie in the closed unit disk, and the only eigenvalue on the unit circle is 1. Moreover, the algebraic multiplicity of the eigenvalue 1 equals its geometric multiplicity, i.e. there is no Jordan block associated with it.

Let M=PJP^{-1} be the Jordan decomposition of M, where J is M's Jordan form. Under the assumptions above, the limit J_\infty = \lim_{n\to\infty}J^n is well defined. Finally, I define M_\infty=PJ_\infty P^{-1}.

I want to compute M_\infty numerically. The problem is that numerically computing the Jordan decomposition is not stable. Is there any numerically stable way to compute M_\infty?

If you know this property of the spectrum analytically, then you just need to find a basis Q for the nullspace of M-I, in which case M^\infty = QQ^* is simply a projection onto this subspace. The function Q = nullspace(M-I) in the LinearAlgebra standard library will do this for you using the SVD. Correction: a non-orthogonal projection is needed, as described below.

A problem still arises, of course, if your matrix has eigenvalues extremely close to 1, so close that nullspace(M-I) can’t distinguish them from 1 to machine precision.

I think this is the solution. I suspect that I don’t have a problem with eigenvalues that are arbitrarily close to 1.

Ah, I think that QQ^* is not M_\infty but J_\infty.

No. J_\infty is diagonal.

I have an example where this does not work. I wrote it in Julia because I’m not sure how to write matrices in LaTeX here.

julia> using LinearAlgebra

julia> M = [1 0 1 0; 0 1 0 1/2; 0 0 0 1/2; 0 0 0 0]
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.0
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.5
 0.0  0.0  0.0  0.0

julia> A = M - I
4×4 Array{Float64,2}:
 0.0  0.0   1.0   0.0
 0.0  0.0   0.0   0.5
 0.0  0.0  -1.0   0.5
 0.0  0.0   0.0  -1.0

For this specific matrix, M^n = M^2 for any n > 1:

julia> M*M
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.5
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> M*M*M
4×4 Array{Float64,2}:
 1.0  0.0  1.0  0.5
 0.0  1.0  0.0  0.5
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

The nullspace of A is straightforward:

julia> Q = nullspace(A)
4×2 Array{Float64,2}:
  0.0  1.0
 -1.0  0.0
  0.0  0.0
  0.0  0.0

However, QQ^* is not M^2:

julia> Q*transpose(Q)
4×4 Array{Float64,2}:
 1.0  0.0  0.0  0.0
 0.0  1.0  0.0  0.0
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Sorry, you’re quite right — the basic problem is that the nullspace of M-I is not orthogonal to the other Jordan vectors of M, so the correct projection is non-orthogonal.

I think what you need, instead, is to project using left and right nullspaces. I came up with the following a bit hastily, but it looks right to me and works for your example:

function Minf(M)
    R = nullspace(M - I)     # basis for the right nullspace of M - I
    L = nullspace(M' - I)'   # basis for the left nullspace, as rows
    return R / (L * R) * L   # oblique projection R (L R)^{-1} L
end

You might want to check it a bit more carefully.
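
As a first check, it reproduces the limit on your example above:

julia> Minf(M) ≈ M*M
true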

I think this is the solution. I will check it more, but in simple cases it works.

Note that my Minf implementation is suboptimal (if you care about factors of 2), since really you can get the left and right nullspaces simultaneously from the left and right singular vectors of the SVD. Just look at the source code for nullspace and pull out the relevant bits.
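
For instance, a sketch along these lines (Minf_svd is a hypothetical name, and the tolerance is a rough stand-in for the default used by nullspace): the right nullspace of M - I is spanned by the right singular vectors belonging to the numerically zero singular values, and the left nullspace by the corresponding left singular vectors, so one full SVD suffices:

function Minf_svd(M)
    F = svd(M - I; full = true)    # one SVD yields both nullspaces
    tol = size(M, 1) * eps(real(float(eltype(M)))) * F.S[1]
    r = count(>(tol), F.S)         # numerical rank of M - I
    R = F.V[:, r+1:end]            # right nullspace basis
    L = F.U[:, r+1:end]'           # left nullspace basis, as rows
    return R / (L * R) * L         # same oblique projection as in Minf
end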

Could you please explain briefly how you came up with this? Why does it work? I cannot figure it out. :confused:

Let us analyze the case where M is diagonalizable, since the defective case of a non-diagonalizable matrix (Jordan blocks) can be obtained at the end as the limit of diagonalizable matrices.

The diagonalization is M = R\Lambda R^{-1} where R is a matrix whose columns are (right) eigenvectors. Then M^T = (R^{-1})^T \Lambda R^T, but this is also a diagonalization and so we see by inspection that (R^{-1})^T = L, a matrix whose columns are left eigenvectors. (This is one proof of the general property that we can choose L^T R = I, i.e. left and right eigenvectors can be chosen orthonormal under the “unconjugated inner product” x^T y.)

That is, M = R \Lambda L^T with L^T R = I, and it follows that M^n = R \Lambda^n L^T.
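
A quick numerical illustration (a made-up 2×2 example with eigenvalues 1 and 0.5, not from this thread):

using LinearAlgebra
A = [1.0 0.25; 0.0 0.5]                 # eigenvalues 1 and 0.5
F = eigen(A)
R = F.vectors                           # columns are right eigenvectors
L = transpose(inv(R))                   # columns are left eigenvectors
@assert transpose(L) * R ≈ Matrix(I, 2, 2)              # L^T R = I
@assert R * Diagonal(F.values)^3 * transpose(L) ≈ A^3   # M^n = R Λ^n L^T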

In your case, all of the eigenvalues have |\lambda|<1 except for the eigenvalues with \lambda=1, so the limit \Lambda^\infty is a diagonal matrix that projects onto the rows/columns corresponding to the \lambda=1 eigenvalues. Hence M^\infty = R \Lambda^\infty L^T = R_0 L_0^T, where the columns of L_0 and R_0 are the left and right eigenvectors for \lambda=1; they span the same subspaces as any left and right nullspace bases \hat{L}_0 and \hat{R}_0 of M-I. Technically, the left nullspace is the nullspace of M^* - I, but this conjugation is cancelled if we also use M^\infty = \hat{R}_0 \hat{L}_0^* instead of the transpose.

However, we have to be careful, especially in the case where \lambda=1 has multiplicity > 1, since there are multiple possible choices of the left and right eigenvectors/nullspaces and we have to choose them consistently to ensure that \hat{L}_0^* \hat{R}_0 = I. If we choose them arbitrarily, we might in general need some change of basis \hat{R}_0 \to \hat{R}_0 B, giving M^\infty = \hat{R}_0 B \hat{L}_0^* for some (invertible) B. We can easily determine B by the fact that M^\infty \hat{R}_0 = \hat{R}_0, which immediately gives B = (\hat{L}_0^* \hat{R}_0)^{-1}.

So, in the end, for any diagonalizable matrix M with the spectrum you assumed, we have M^\infty = \hat{R}_0 (\hat{L}_0^* \hat{R}_0)^{-1} \hat{L}_0^*, in terms of left and right nullspace bases \hat{L}_0 and \hat{R}_0 of M-I (regardless of how they are obtained/normalized).
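
As a sanity check, this formula behaves as the limit must: since M\hat{R}_0 = \hat{R}_0, we get M M^\infty = (M\hat{R}_0)(\hat{L}_0^* \hat{R}_0)^{-1}\hat{L}_0^* = M^\infty, and it is idempotent, since (M^\infty)^2 = \hat{R}_0 (\hat{L}_0^* \hat{R}_0)^{-1}(\hat{L}_0^* \hat{R}_0)(\hat{L}_0^* \hat{R}_0)^{-1}\hat{L}_0^* = M^\infty.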

For a defective matrix M, as long as the \lambda=1 eigenvalue is not itself defective, we can take M as the limit of diagonalizable matrices (which is always possible, since diagonalizable matrices are a dense subset of all square matrices), and we get the same answer: the limits of the \lambda=1 eigenvectors still span the nullspaces of M-I.

PS. I guess you get this spectrum from some kind of Markov property?

Thank you! That was very clear. Yes, it is the matrix of a random walk on a weakly connected graph where most of the vertices are transient.
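
For concreteness, here is a made-up toy chain of that kind (states 1 and 2 absorbing, states 3 and 4 transient; the numbers are mine), on which Minf recovers the absorption probabilities:

P = [1.0  0.0  0.0  0.0;
     0.0  1.0  0.0  0.0;
     0.5  0.0  0.0  0.5;
     0.0  0.5  0.5  0.0]

Minf(P)   # rows 3 and 4 converge to (2/3, 1/3, 0, 0) and (1/3, 2/3, 0, 0)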

IIUC this would also work:

function Minf2(M, tol = 10 * size(M, 1) * eps(real(float(eltype(M)))))
    S = schur(M)
    select = abs.(S.values .- 1) .< tol   # flag eigenvalues within tol of 1
    S1 = ordschur(S, select)              # reorder them to the leading block
    n1 = count(select)
    Z1 = S1.Z[:, 1:n1]                    # orthonormal basis for their invariant subspace
    return Z1 * Z1'
end

You might want to verify that S1.T[1:n1,1:n1] really is close to the identity matrix, as a check on construction of M and choice of tolerance.

There are also interesting possibilities for iterative methods in case your graphs are large.
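
For example (a rough sketch only; minf_apply is a hypothetical name and the stopping rule is a placeholder): since M^n x \to M^\infty x, you can compute the action of M^\infty on a vector by repeated sparse matvecs, without forming M^\infty densely. The convergence rate is set by the largest |\lambda| < 1.

function minf_apply(M, x; maxiter = 10_000, tol = 1e-12)
    for _ in 1:maxiter
        y = M * x
        norm(y - x) <= tol * norm(x) && return y   # fixed point of M reached
        x = y
    end
    return x
end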

Nope, Minf2 doesn’t work. Try it on @gmout’s example above.
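
Indeed, with the M from that post:

julia> Minf2(M) ≈ M*M
false

It returns the orthogonal projection onto the \lambda=1 invariant subspace, which is essentially QQ^* again.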

You’re right, of course. I left out a “left” in my thinking; I’d have to do a second Schur decomposition and a linear solve like yours, so the suggested SVD approach is more efficient and just as stable AFAICT.