Why re-use factorization(A) rather than inv(A)

For a dense m \times m matrix, using the factorization to left-divide every time involves:

  1. Computing the factorization (typically LU) A = LU (technically PA = LU including pivots, but let me leave that out for simplicity). This has \Theta(m^3) cost, but is only done once for all right-hand sides. F = lu(A) stores L, U, and P implicitly in F.
  2. Using the factorization to solve Ax = LUx = b for each right-hand side, which has \Theta(m^2) cost. In particular, solving LUx = b \Longleftrightarrow x = U^{-1} (L^{-1} b) involves two \Theta(m^2) triangular solves: c = L^{-1} b \Longleftrightarrow Lc = b is computed by forward substitution (top to bottom) and x = U^{-1} c \Longleftrightarrow Ux = c is computed by backsubstitution (bottom to top). x = F \ b does this (see the sketch below).
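
To make the pattern concrete, here is a minimal Julia sketch (with random data purely for illustration): factor once, then reuse the factorization for each new right-hand side.

```julia
using LinearAlgebra

m = 1000
A = randn(m, m)        # dense m×m matrix (random data just for illustration)
F = lu(A)              # Θ(m³): computes PA = LU once; P, L, U are stored in F

b1 = randn(m)
b2 = randn(m)
x1 = F \ b1            # Θ(m²): forward substitution Lc = Pb1, then backsubstitution Ux1 = c
x2 = F \ b2            # each additional right-hand side reuses F at only Θ(m²) cost
```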

Note that using the factorization to solve Ax = b has \Theta(m^2) cost, similar to a matrix–vector multiply; indeed, computing x = F \ b costs roughly the same as computing x = A^{-1} b would if you had precomputed the matrix inverse (though the constant factors for the latter are a bit better).
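
If you want to see this on your own machine, here is a rough (and machine-dependent) timing sketch; it assumes the BenchmarkTools.jl package is installed.

```julia
using LinearAlgebra, BenchmarkTools

m = 1000
A = randn(m, m); b = randn(m)
F = lu(A)              # Θ(m³), done once
Ainv = inv(A)          # also Θ(m³), done once

@btime $F \ $b         # two Θ(m²) triangular solves
@btime $Ainv * $b      # one Θ(m²) matrix–vector multiply (usually a somewhat smaller constant)
```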

Inversion of a dense m \times m matrix involves two steps:

  1. Computing the factorization (typically LU) A = LU (technically PA = LU including pivots, but let me leave that out for simplicity). This has \Theta(m^3) cost.
  2. Using the factorization to solve Ax = LUx = b for m right-hand sides, corresponding to AX = I. Each right-hand side (each column of I) has \Theta(m^2) cost, so all of these solves together have \Theta(m^3) cost.
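
In Julia terms, this is essentially what inv(A) does under the hood; a hand-rolled sketch (for illustration only, not something you should do in practice) looks like:

```julia
using LinearAlgebra

m = 500
A = randn(m, m)
F = lu(A)                            # Θ(m³): one factorization

# m right-hand sides — the columns of I — each solved at Θ(m²), so Θ(m³) in total:
Ainv = F \ Matrix{Float64}(I, m, m)

Ainv ≈ inv(A)                        # true (up to rounding)
```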

(You may have learned the Gauss–Jordan algorithm for A^{-1} in school. This is equivalent to those two steps: Gaussian elimination on A is equivalent to LU factorization, applying the same elimination steps to I is equivalent to forward substitution C = L^{-1} I, and then the “upwards” elimination is equivalent to backsubstitution X = U^{-1} C.) Given the inverse,

  1. Applying A^{-1} to a vector is \Theta(m^2) cost, comparable to the two triangular solves above (though the constant factors are a bit better).

Thus, matrix inversion is equivalent to solving m right-hand sides, so it is definitely not worth it if you are solving \ll m right-hand sides, e.g. you are doing just a couple of Newton steps. It might be worth it if you are solving \gg m right-hand sides, because the constant factors for a matrix–vector multiply are better than for two triangular solves, but this seems to be a relatively rare situation.

In general, you should get in the habit of never computing matrix inverses. Read A^{-1} b as “solve Ax=b by the best-available method.”
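
In Julia, that habit is just the backslash operator; a tiny illustrative example:

```julia
using LinearAlgebra

A = randn(4, 4); b = randn(4)   # illustrative data
x = A \ b                       # "solve Ax = b": \ picks a suitable factorization and solves
# rather than: x = inv(A) * b
```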

PS. Computing inverses is much, much worse if A is a sparse matrix, because in that case L and U are typically still sparse (though with some fill-in), but A^{-1} is typically dense.
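
Here is a small sketch of the sparse case (a random sparse matrix shifted by I, purely for illustration; real problems have more structure):

```julia
using LinearAlgebra, SparseArrays

m = 2000
A = sprand(m, m, 0.001) + I    # random sparse matrix, shifted to make it (almost surely) nonsingular
F = lu(A)                      # sparse LU (UMFPACK): the L and U factors stay relatively sparse
nnz(F.L) + nnz(F.U)            # typically far fewer than m² stored entries

Ainv = inv(Matrix(A))          # the inverse, computed densely
count(!iszero, Ainv)           # essentially all m² entries are nonzero
```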

PPS. Sometimes, people argue that solving Ax = b via the matrix inverse is much less accurate because it involves additional steps. Druinsky and Toledo (2012) argue that this is not typically a major concern, however.
