You don’t even need an explicit Cholesky factorization: for symmetric tridiagonal matrices, I seem to recall that we have a `T \ b` method that works directly with `T` (and is also O(n)). In fact, there is an `ldiv!(T, b; shift=µ)` method that computes `(T + µ*I) \ b` in-place in `b`.

Update: checking the source code, it seems that it does form an `ldlt` factorization explicitly; it is still O(n), though, and in principle the factorization could probably be elided into the solve.
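For example, a minimal sketch of the shifted tridiagonal solve (assuming a Julia version recent enough to have the `shift` keyword described above; the matrix and shift here are arbitrary illustrations):

```julia
using LinearAlgebra

n = 1000
# a symmetric, diagonally dominant tridiagonal test matrix:
T = SymTridiagonal(fill(4.0, n), fill(1.0, n - 1))
b = rand(n)
µ = 0.5

x = copy(b)
ldiv!(T, x; shift=µ)       # solves (T + µ*I) x = b in-place in x, O(n)

@assert (T + µ*I) * x ≈ b  # check the shifted solve
```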
If you compute a Hessenberg factorization object `F = hessenberg(A)` in Julia, this is exploited under the hood. As explained in the documentation, it directly supports shifts: `(F + µ*I) \ b` or `ldiv!(F + µ*I, b)` is very cheap (O(n) for Hermitian matrices, and O(n^2) for general matrices).
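Concretely, the pattern looks like this (a sketch with an arbitrary random matrix; the expensive `hessenberg` call happens once, and each shifted solve reuses it):

```julia
using LinearAlgebra

A = randn(200, 200)
b = randn(200)

F = hessenberg(A)     # O(n^3), but only done once

µ = 1.7
x = (F + µ*I) \ b     # O(n^2) shifted solve, reusing F

@assert (A + µ*I) * x ≈ b
```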
What is (currently) missing is a BandedMatrices.jl method that creates a `Hessenberg` factorization object from a banded matrix, with an implicit representation of `Q` as a product of Givens rotations. This should be straightforward to add, in principle, since much of the code is already written.
The Hessenberg approach could be an especially big win if you are solving for different shifts with the same right-hand side, since in that case you may only need to apply the Q transformation once and then just do tridiagonal solves for each shift.
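One way to sketch this multi-shift pattern by hand, using the `F.Q` and `F.H` fields of the `Hessenberg` object (for Hermitian `A`, `F.H` is a `SymTridiagonal`; whether this actually beats calling `(F + µ*I) \ b` per shift depends on the implementation):

```julia
using LinearAlgebra

A = Hermitian(randn(500, 500))
b = randn(500)

F = hessenberg(A)    # reduce A = Q*H*Q' to tridiagonal form, once
q = F.Q' * b         # apply the Q transformation to b, once

for µ in (0.1, 1.0, 10.0)
    y = copy(q)
    ldiv!(F.H, y; shift=µ)   # O(n) symmetric-tridiagonal solve
    x = F.Q * y              # transform the solution back
    @assert (A + µ*I) * x ≈ b
end
```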
Finding the Hessenberg factorization is significantly cheaper than computing the diagonalization of the matrix — in fact, the first step of diagonalization is usually a Hessenberg factorization, so this saves you the subsequent iterative diagonalization steps.
(For general square matrices, the analogous thing to diagonalization for shifted solves would typically be a Schur factorization, not an SVD, but again Hessenberg is the first step of a Schur calculation so working with Hessenberg is cheaper.)
But again, for banded matrices with small bandwidth, every reasonable algorithm is linear O(n) time, including just re-computing the Cholesky factorization for each shift — so it boils down to a battle of the constant factors and which method is more optimized. It’s not obvious a priori which method will “win”, but on the other hand they are all so fast that it may not matter much.