Thomas algorithm for banded matrices

rveltz · April 18, 2019, 6:16am

Dear All,

I am trying to solve linear systems Ax = rhs where A is tridiagonal. In that case, one can use the Thomas algorithm. I find it faster than defining A as a Tridiagonal and calling \, whose current implementation uses an LU decomposition followed by a solve. Maybe \ does not rely on the Thomas algorithm for Tridiagonal because of numerical stability.
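For reference, here is a minimal sketch of the tridiagonal Thomas algorithm being compared against \. The helper name `thomas` and the test sizes are assumptions made for illustration. The loop does no pivoting, so it is only safe when the pivots stay away from zero, as for the diagonally dominant matrix used here; to my knowledge the LU behind \ for Tridiagonal does pivot, which would be consistent with the stability remark above.

```julia
using LinearAlgebra

# Thomas algorithm (LU without pivoting) for a tridiagonal system.
# dl: sub-diagonal (length n-1), d: diagonal (length n), du: super-diagonal (length n-1).
# `thomas` is a hypothetical helper name, not a LinearAlgebra function.
function thomas(dl::AbstractVector, d::AbstractVector, du::AbstractVector, rhs::AbstractVector)
    n = length(d)
    @assert length(dl) == n - 1 && length(du) == n - 1 && length(rhs) == n
    v = float.(d)      # fresh copy, becomes the pivots of U
    y = float.(rhs)    # fresh copy, becomes the solution
    # forward elimination: factorize and solve L y = rhs in the same pass
    for i in 2:n
        l = dl[i-1] / v[i-1]
        v[i] -= l * du[i-1]
        y[i] -= l * y[i-1]
    end
    # back-substitution: solve U x = y from the last row upwards
    y[n] /= v[n]
    for i in n-1:-1:1
        y[i] = (y[i] - du[i] * y[i+1]) / v[i]
    end
    return y
end

n = 1_000
dl, du = rand(n - 1), rand(n - 1)
d = 2 .+ rand(n)       # strictly diagonally dominant rows, so no pivoting is needed
rhs = rand(n)

x = thomas(dl, d, du, rhs)
x_ref = Tridiagonal(dl, d, du) \ rhs
@show norm(x - x_ref, Inf)
```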

I want to do something slightly different. I want to solve the same problem for a banded matrix with a diagonal band and two other bands. When the bandwidth is 1, one recovers the Thomas algorithm case above.

I have the MWE below, but it does not work. As you can see, I first compute the LU decomposition of the matrix and then solve the linear systems. I have two issues I cannot solve:

Firstly, despite my analytical formula, my LU decomposition seems wrong.
Secondly, I am able to invert L but unable to invert U, which I am ashamed of.

If anyone has an idea, I’d be delighted.

Thank you a lot.

using LinearAlgebra, SparseArrays
function thomas2bands!(a::vec, b::vec, c::vec, rhs::vec) where {T, vec <: Vector{T}}
	@assert length(a) == length(c)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# we have L and U written 8 lines below, L = [l, 1] and U = [v, c]
	v = copy(b)
	l = a ./ v[1:p]
	for ii = n-p+1:n
		v[ii] = b[ii] - l[ii-n+p] * c[ii-n+p]
	end
	# up to here, this has been checked, the LU decomposition seems OK
	@show n-p
	L = spdiagm(-n+p => l, 0 => fill(1.0, n))
	U = spdiagm(0 => v, n-p => c)
	A = spdiagm(-n+p => a, 0 => b, n-p => c)
	println("-> A - LU = ", norm(L * U - A, Inf64))

	# we want to solve LUx = rhs
	# we first solve Ly = rhs
	y = similar(rhs)
	for ii=1:n-p
		y[ii] = rhs[ii]
	end
	for ii=n-p+1:n
		y[ii] = rhs[ii] - l[ii-n+p] * y[ii-n+p]
	end
	println("-> L \\ rhs = ",norm(y -  (L \ rhs), Inf64))

	# we solve Ux = y
	x = similar(rhs)
	for ii = n-p+1:n
		x[ii] = y[ii] / v[ii]
	end
	for ii=1:n-p
		x[ii] = y[ii] / v[ii] - x[ii+n-p] * c[ii] / v[ii]
	end
	println("-> y - U * x   = ",norm((y -  (U * x)), Inf64))
	println("-> U \\ y   = ",norm(x -  (U \ y),Inf64))
	return x
end
	sol = @time thomas2bands!(rand(10^5-10^3), 2 .+ rand(10^5), rand(10^5-10^3), rand(10^5) )

tlienart · April 18, 2019, 8:53am

I think the problem is that the analytical case you use to build the full formula assumes that there is no column where both off-diagonal terms are non-zero. For instance, consider the 3rd column in

rveltz · April 18, 2019, 9:00am

Thank you for your reply! When I multiply your matrices, I get

 1  0  1  0  0
 0  1  0  1  0
 1  0  2  0  1
 0  1  0  2  0
 0  0  1  0  2

It seems it has the correct sparsity pattern. Did you look at my Maple document?

tlienart · April 18, 2019, 9:04am

Ha, sorry: I was checking what I wrote on a piece of paper as you typed your response and realised that’s not it. And yes, I did look at the Maple document; I thought it was missing a case and that this could explain the problem.

Edit: there still seems to be a problem with the lower diagonal. When doing

sol, R = @time thomas2bands!(rand(3), 2 .+ rand(5), rand(3), rand(5) )

where R is Matrix(A-L*U) I get

5×5 Array{Float64,2}:
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0190783  0.0  0.0

However, with 6 and 3 it’s fine. With 7 and 5, the off-diagonal is wrong for the last 3 elements.
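The pattern in these three cases can be reproduced with the sketch below (the helper name `lu_error` is an assumption for illustration). Writing m = n - p for the band offset, the first post computes `l = a ./ b[1:p]`, i.e. it divides by the original diagonal. That agrees with dividing by the pivots `v` only for indices up to m, so the last p - m entries of `l` are off whenever p > m: one entry for 5/3, none for 6/3, three for 7/5. The diagonal of L*U still matches A because `v` is built from the same `l`.

```julia
using LinearAlgebra

# Returns norm(L*U - A, Inf) for the band vectors a (lower), b (diagonal), c (upper).
# naive = true divides by b as in the first post; naive = false divides by the updated pivots.
function lu_error(a, b, c; naive::Bool)
    n, p = length(b), length(a)
    m = n - p                              # offset of the two outer bands
    v = float.(b)
    l = similar(v, p)
    if naive
        l .= a ./ b[1:p]
        for i in m+1:n
            v[i] = b[i] - l[i-m] * c[i-m]
        end
    else
        for i in m+1:n                     # v[i-m] is already final when row i is reached
            l[i-m] = a[i-m] / v[i-m]
            v[i] = b[i] - l[i-m] * c[i-m]
        end
    end
    L = diagm(-m => l) + I
    U = diagm(0 => v, m => c)
    A = diagm(-m => a, 0 => b, m => c)
    return norm(L * U - A, Inf)
end

for (n, p) in ((5, 3), (6, 3), (7, 5))
    a, b, c = rand(p), 2 .+ rand(n), rand(p)
    println("n = $n, p = $p: naive error = ", lu_error(a, b, c; naive = true),
            ", recursive error = ", lu_error(a, b, c; naive = false))
end
```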

rveltz · April 18, 2019, 9:18am

I agree!! That’s my first mistake I can’t solve. This is strange because the v part seems fine despite involving l which seems to be wrong…

You can check this by doing lu(A), the sparsity is the same as I assumed for L and U…

BTW: I updated the document with the formula.

tlienart · April 18, 2019, 9:25am

Actually that’s just what I did and it doesn’t seem to be the case? (this is for 7/5)

julia> L
6×6 Array{Float64,2}:
 1.0       0.0       0.0  0.0  0.0  0.0
 0.0       1.0       0.0  0.0  0.0  0.0
 0.0       0.0       1.0  0.0  0.0  0.0
 0.0       0.0       0.0  1.0  0.0  0.0
 0.823648  0.0       0.0  0.0  1.0  0.0
 0.0       0.910357  0.0  0.0  0.0  1.0

julia> luL
7×7 Array{Float64,2}:
 1.0       0.0       0.0        0.0       0.0      0.0  0.0
 0.0       1.0       0.0        0.0       0.0      0.0  0.0
 0.220607  0.0       1.0        0.0       0.0      0.0  0.0
 0.0       0.209361  0.0        1.0       0.0      0.0  0.0
 0.0       0.0       0.0931903  0.0       1.0      0.0  0.0
 0.0       0.0       0.0        0.324195  0.0      1.0  0.0
 0.0       0.0       0.0        0.0       0.06044  0.0  1.0

julia> U
6×6 Array{Float64,2}:
 0.164566  0.0       0.0      0.0       0.361828   0.0      
 0.0       0.177329  0.0      0.0       0.0        0.973216 
 0.0       0.0       0.27888  0.0       0.0        0.0      
 0.0       0.0       0.0      0.203477  0.0        0.0      
 0.0       0.0       0.0      0.0       0.0423017  0.0      
 0.0       0.0       0.0      0.0       0.0        0.0682693

julia> luU
7×7 Array{Float64,2}:
 2.65545  0.0      0.353129  0.0       0.0       0.0       0.0      
 0.0      2.57589  0.0       0.767602  0.0       0.0       0.0      
 0.0      0.0      2.79038   0.0       0.043141  0.0       0.0      
 0.0      0.0      0.0       2.80709   0.0       0.267985  0.0      
 0.0      0.0      0.0       0.0       2.76367   0.0       0.0668464
 0.0      0.0      0.0       0.0       0.0       2.38242   0.0      
 0.0      0.0      0.0       0.0       0.0       0.0       2.05833

Edit: hmmm the dimensions are not quite right… could that be it?

rveltz · April 18, 2019, 9:29am

lu(Matrix(A)) I guess

Also, there is a mistake in the definition of L which I corrected

rveltz · April 19, 2019, 6:49am

It actually seems to work when the bandwidth is even, which is quite puzzling. Maybe a mistake with the indices.

tlienart · April 19, 2019, 7:08am

Ok, I spent quite some time on this because I thought it was fun. Here’s an algo that actually works; note that the decomposition is a bit trickier.

It’s just a bit annoying with the indices floating around


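using LinearAlgebra

function decomp(A, p)
    n = size(A, 1)

    λ = zeros(p)   # lower band of L
    μ = zeros(p)   # upper band of U
    v = zeros(n)   # diagonal of U

    # the first n-p pivots are just the diagonal of A
    for i=1:n-p
        v[i] = A[i,i]
    end

    # the upper band of U is the upper band of A
    for j = n-p+1:n
        μ[j-n+p] = A[j-n+p, j]
    end

    # remaining entries, in blocks of n-p rows: each block only needs the previous one
    for k=1:fld(n, n-p)
        for i=(k*(n-p)+1):min((k+1)*(n-p), n)
            λ[i-n+p] = A[i,i-n+p] / v[i-n+p]
            v[i] = A[i, i] - λ[i-n+p] * μ[i-n+p]
        end
    end

    v, λ, μ
end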

buildL(λ, n) = diagm(-n+length(λ) => λ) + I

buildU(v, μ, n) = diagm(0=>v, n-length(μ)=>μ)


n, p = 7, 5

v = randn(n)
λ = randn(p)
μ = randn(p)

L = buildL(λ, n)
U = buildU(v, μ, n)

A = L*U

rv, rλ, rμ = decomp(A, p)

norm(rv - v)
norm(rλ - λ)
norm(rμ - μ)

Edit: to get to that, I wrote down the equation for rows of L and columns of U and computed what their dot product looks like in general. This gave me:

function rowL(i, λ, n)
    p = length(λ)
    ri = zeros(n)
    for j = 1:n
        ri[j] = (j==i ? 1.0 : 0.0) +
                   ((j == i-n+p) ? λ[j] : 0.0)
    end
    ri
end

function colU(j, v, μ)
    n  = length(v)
    p  = length(μ)
    cj = zeros(n)
    for i = 1:n
        cj[i] = (i==j ? v[i] : 0.0) +
                    ((i == j-n+p) ? μ[i] : 0.0)
    end
    cj
end

function LU(i, j, v, λ, μ)
    p = length(λ)
    n = length(v)

    ri = rowL(i, λ, n)
    cj = colU(j, v, μ)

    gt = dot(ri, cj)

    hand = (i == j ? v[i] + (i ≥ n-p+1 ? λ[i-n+p]*μ[i-n+p] : 0.0) : 0.0) +
           (j == i-n+p ? λ[i-n+p]*v[i-n+p] : 0.0) +
           (i == j-n+p ? μ[i] : 0.0)

#    @show  gt-hand
    return hand
end

The expressions here are deliberately left unsimplified, to show how I got an expression for a[i, j]. Once I had the LU bit, I reversed the steps and realised that there was a catch: you can only get v and lambda in blocks (as per the previous algo).
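Written out as a formula, the `hand` expression above reads as follows, with the shorthand m = n - p (my notation, not used in the code) and Kronecker deltas for the three bands:

```latex
a_{ij} = (LU)_{ij}
       = \delta_{ij}\left( v_i + [\, i > m \,]\, \lambda_{i-m}\, \mu_{i-m} \right)
       + \delta_{j,\,i-m}\, \lambda_{i-m}\, v_{i-m}
       + \delta_{i,\,j-m}\, \mu_i ,
\qquad m = n - p .
```

Matching the three bands of A then gives, in this order:

```latex
\mu_i = a_{i,\,i+m} \quad (1 \le i \le p), \qquad
v_i = a_{ii} \quad (1 \le i \le m), \qquad
\lambda_{i-m} = \frac{a_{i,\,i-m}}{v_{i-m}}, \quad
v_i = a_{ii} - \lambda_{i-m}\, \mu_{i-m} \quad (m < i \le n) .
```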

Edit2: what I mean by “blocks” is that the equations allow you to get things from n-p+1 to 2n-2p, then from 2n-2p+1 to 3n-3p, etc., until you bottom out. But past n-p+1 you cannot in general get all remaining v and lambda in one shot.
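A small sketch of what the blocks amount to in practice (function names are assumptions for illustration). Entry i only needs entry i-(n-p), which lies in the previous block, so a whole block could be computed at once, but not several blocks. If the rows are simply visited in ascending order, that dependency is satisfied automatically, and the block loop and a single pass give the same numbers:

```julia
# Block-wise recursion, as in decomp above, written on the band vectors a, b, c.
function factor_blocks(a, b, c)
    n, p = length(b), length(a)
    m = n - p
    v = float.(b)
    l = zeros(eltype(v), p)
    for k in 1:fld(n, m)
        for i in (k * m + 1):min((k + 1) * m, n)
            l[i-m] = a[i-m] / v[i-m]
            v[i] = b[i] - l[i-m] * c[i-m]
        end
    end
    return v, l
end

# The same recursion as one ascending pass over rows m+1:n.
function factor_single(a, b, c)
    n, p = length(b), length(a)
    m = n - p
    v = float.(b)
    l = zeros(eltype(v), p)
    for i in m+1:n
        l[i-m] = a[i-m] / v[i-m]
        v[i] = b[i] - l[i-m] * c[i-m]
    end
    return v, l
end

n, p = 7, 5
a, b, c = rand(p), 2 .+ rand(n), rand(p)
@show factor_blocks(a, b, c) == factor_single(a, b, c)
```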

I think the Maple document you had doesn’t show this because p is too small but if you take the n=7 and p=5 case, you can see how some of this plays out.

Edit3: it may be that the decomposition I suggest can be further simplified, I didn’t really try, I just know that it’s correct as I tried a big bunch of n and p and the relative error is always around 1e-15.

tlienart · April 19, 2019, 7:37am

Here’s your code adjusted


using LinearAlgebra, SparseArrays
function thomas2bands!(a::vec, b::vec, c::vec, rhs::vec) where {T, vec <: Vector{T}}
	@assert length(a) == length(c)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# L and U are built a few lines below, L = [λ, 1] and U = [v, c]

	λ = zeros(p)
	v = zeros(n)

	for i=1:n-p
		v[i] = b[i]
	end

	for k=1:fld(n, n-p)
		for i=(k*(n-p)+1):min((k+1)*(n-p), n)
			λ[i-n+p] = a[i-n+p] / v[i-n+p]
			v[i] = b[i] - λ[i-n+p] * c[i-n+p]
		end
	end

	# up to here, this has been checked, the LU decomposition seems OK
	@show n-p
	L = spdiagm(-n+p => λ, 0 => fill(1.0, n))
	U = spdiagm(0 => v, n-p => c)
	A = spdiagm(-n+p => a, 0 => b, n-p => c)
	println("-> A - LU = ", norm(L * U - A, Inf64))

	# we want to solve LUx = rhs
	# we first solve Ly = rhs
	y = similar(rhs)
	for ii=1:n-p
		y[ii] = rhs[ii]
	end
	for ii=n-p+1:n
		y[ii] = rhs[ii] - λ[ii-n+p] * y[ii-n+p]
	end
	println("-> L \\ rhs = ",norm(y -  (L \ rhs), Inf64))

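	# we solve Ux = y by back-substitution, from the last row upwards;
	# only rows 1:p have an upper-band entry c[ii], located in column ii+n-p
	x = similar(rhs)
	for ii = n:-1:p+1
		x[ii] = y[ii] / v[ii]
	end
	for ii = p:-1:1
		x[ii] = (y[ii] - c[ii] * x[ii+n-p]) / v[ii]
	end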
	println("-> y - U * x   = ",norm((y -  (U * x)), Inf64))
	println("-> U \\ y   = ",norm(x -  (U \ y),Inf64))
	return x
end
	sol = @time thomas2bands!(rand(10^5-10^3), 2 .+ rand(10^5), rand(10^5-10^3), rand(10^5))

This gives

n - p = 1000
-> A - LU = 4.440892098500626e-16

(I didn’t check the rest)

rveltz · April 19, 2019, 4:00pm

Thank you a lot for your intense help!

I will digest this.

rveltz · May 1, 2019, 7:36am

Hey

Sorry for coming back late. The following works for me:

function thomas2bands!(a, b, c, d; debug = false)
	@assert length(a) == length(c)
	@assert length(b) == length(d)
	debug && println("#"^20)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# we have L and U written 8 lines below, L = [l, 1] and U = [v, c]
	v = similar(b)
	l = similar(a)
	for i=1:n-p
		v[i] = b[i]
	end

	for i=n-p+1:n
		l[i-n+p] = a[i-n+p] / v[i-n+p]
		v[i] = b[i] - l[i-n+p] * c[i-n+p]
	end
	# up to here, this has been checked, the LU decomposition seems OK

	debug && (L = spdiagm(-n+p => l, 0 => fill(1.0, n)))
	debug && (U = spdiagm(0 => v, n-p => c))
	debug && (A = spdiagm(-n+p => a, 0 => b, n-p => c))
	debug && println("-> A - LU = ", norm(L * U - A, Inf64))
end
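The function as posted stops after the factorization. A minimal sketch of how the solve could be completed is given below; the name `thomas2bands_solve` and the residual check are assumptions, not the final version of the original code. Two points matter for the U step: only rows 1:p carry an upper-band entry c[i], and the rows have to be visited from the bottom up. As there is no pivoting, the sketch assumes the pivots never get close to zero (for instance a diagonally dominant A, as in the test data of this thread).

```julia
using LinearAlgebra, SparseArrays

# Solves A x = d where A has the bands a (offset -(n-p)), b (diagonal) and c (offset n-p).
# Hypothetical helper; no pivoting, so the pivots v[i] are assumed to stay away from zero.
function thomas2bands_solve(a, b, c, d)
    n, p = length(b), length(a)
    @assert length(c) == p && length(d) == n && p < n
    m = n - p
    v = float.(b)      # fresh copy, becomes the diagonal of U
    x = float.(d)      # fresh copy, becomes y and then the solution x
    # factorization A = L U and forward substitution L y = d in one ascending pass
    for i in m+1:n
        li = a[i-m] / v[i-m]
        v[i] -= li * c[i-m]
        x[i] -= li * x[i-m]
    end
    # back-substitution U x = y: rows p+1:n only have a diagonal entry
    for i in n:-1:p+1
        x[i] /= v[i]
    end
    # rows 1:p also have c[i] in column i+m, which is already solved at this point
    for i in p:-1:1
        x[i] = (x[i] - c[i] * x[i+m]) / v[i]
    end
    return x
end

n, p = 10^5, 10^5 - 10^3
a, b, c, d = rand(p), 2 .+ rand(n), rand(p), rand(n)
x = thomas2bands_solve(a, b, c, d)
A = spdiagm(-(n - p) => a, 0 => b, n - p => c)
@show norm(A * x - d, Inf)
```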

JosephPollacco · March 31, 2020, 6:12am

Hello Romain,

I am solving the Richards equation (hydrology), which leads to a tridiagonal nonlinear system of equations that I solve with the Newton-Raphson method. I understand that your code will do the job. Is your posted code the latest version?

Thanks for any help you may provide.

rveltz · March 31, 2020, 12:29pm

Hi,

You may be mistaken. I was interested in a generalisation of the Thomas algorithm when there are 5 bands. You can invert your matrix by making it a Tridiagonal and calling \.
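As a rough illustration of that suggestion, here is a sketch of a Newton-Raphson loop whose Jacobian is stored as a Tridiagonal and solved with \. The residual is a toy problem chosen for this example, not the Richards equation, and the names `residual`, `jacobian` and `newton` are all hypothetical.

```julia
using LinearAlgebra

# Toy residual (NOT the Richards equation):
# F_i(u) = -u[i-1] + 2u[i] - u[i+1] + u[i]^3 - 1, with u[0] = u[n+1] = 0.
function residual(u)
    n = length(u)
    F = similar(u)
    for i in 1:n
        left  = i > 1 ? u[i-1] : zero(eltype(u))
        right = i < n ? u[i+1] : zero(eltype(u))
        F[i] = -left + 2u[i] - right + u[i]^3 - 1
    end
    return F
end

# The Jacobian of this residual is tridiagonal:
# -1 on both off-diagonals and 2 + 3u[i]^2 on the diagonal.
jacobian(u) = Tridiagonal(fill(-1.0, length(u) - 1),
                          2 .+ 3 .* u .^ 2,
                          fill(-1.0, length(u) - 1))

function newton(u0; tol = 1e-12, maxiter = 50)
    u = copy(u0)
    for _ in 1:maxiter
        F = residual(u)
        norm(F, Inf) < tol && break
        u -= jacobian(u) \ F    # tridiagonal linear solve for the Newton step
    end
    return u
end

u = newton(ones(100))
@show norm(residual(u), Inf)
```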
