Thomas algorithm for banded matrices

Dear All,

I am trying to solve linear systems Ax = rhs where A has a tridiagonal shape. When the type of A is tridiagonal, one can use the thomas algo to solve it. I find it faster than defining A as a Tridiagonal and calling \. Indeed, the actual implementation of \ uses a LU decomposition followed by a solve. Maybe the choice of not relying on \ for Tridiagonal is numerical stability of the algorithm.

I want to do something slighly different. I want to solve the same problem for a banded matrix with a diagonal band and two other bands. When the bandwith is 1, one recovers the case above of the Thomas algo.

I have the MWE below, but it does not work. As you can see, I first compute the LU decomposition of the matrix and then solve the linear systems. I have two issues I cannot solve :frowning:

  • Firstly despite my analytical formula, my LU decomposition seems wrong.
  • Secondly, I am able to invert L but I am unable to invert U which I am ashamed of.

If anyone has an idea, I’d be delighted.

Thank you a lot.

using LinearAlgebra, SparseArrays
function thomas2bands!(a::vec, b::vec, c::vec, rhs::vec) where {T, vec <: Vector{T}}
	@assert length(a) == length(c)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# we have L and U written 8 lines below, L = [l, 1] and U = [v, c]
	v = copy(b)
	l = a ./ v[1:p]
	for ii = n-p+1:n
		v[ii] = b[ii] - l[ii-n+p] * c[ii-n+p]
	end
	# up to here, this has been check, the LU decomposition seems OK
	@show n-p
	L = spdiagm(-n+p => l, 0 => fill(1.0, n))
	U = spdiagm(0 => v, n-p => c)
	A = spdiagm(-n+p => a, 0 => b, n-p => c)
	println("-> A - LU = ", norm(L * U - A, Inf64))

	# we want to solve LUx = rhs
	# we first solve Ly = rhs
	y = similar(rhs)
	for ii=1:n-p
		y[ii] = rhs[ii]
	end
	for ii=n-p+1:n
		y[ii] = rhs[ii] - l[ii-n+p] * y[ii-n+p]
	end
	println("-> L \\ rhs = ",norm(y -  (L \ rhs), Inf64))

	# we solve Ux = y
	x = similar(rhs)
	for ii = n-p+1:n
		x[ii] = y[ii] / v[ii]
	end
	for ii=1:n-p
		x[ii] = y[ii] / v[ii] - x[ii+n-p] * c[ii] / v[ii]
	end
	println("-> y - U * x   = ",norm((y -  (U * x)), Inf64))
	println("-> U \\ y   = ",norm(x -  (U \ y),Inf64))
	return x
end
	sol = @time thomas2bands!(rand(10^5-10^3), 2 .+ rand(10^5), rand(10^5-10^3), rand(10^5) )

1 Like

I think the problem is that in the analytical case you consider to build the full formula, you assume that there is no column where both off diagonal terms are non-zero. For instance consider the 3d column in

Thank you for your reply! When I multiply your matrices, I get

 1  0  1  0  0
 0  1  0  1  0
 1  0  2  0  1
 0  1  0  2  0
 0  0  1  0  2

It seems it has the correct sparsity pattern. Did you look at my Maple document?

Ha sorry I was checking what I wrote on a piece of paper as you typed your response and realised that’s not it (sorry) and yes I did look at the maple document I thought it was missing a case and that it could explain the problem.

Edit: there still seems to be a problem with the lower diagonal. When doing

sol, R = @time thomas2bands!(rand(3), 2 .+ rand(5), rand(3), rand(5) )

where R is Matrix(A-L*U) I get

5×5 Array{Float64,2}:
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0        0.0  0.0
 0.0  0.0  0.0190783  0.0  0.0

however when going with 6, 3 then it’s fine. with 7 and 5 the off diagonal is wrong for the last 3 elements.

I agree!! That’s my first mistake I can’t solve. This is strange because the v part seems fine despite involving l which seems to be wrong…

You can check this by doing lu(A), the sparsity is the same as I assumed for L and U…

BTW: I updated the document with the formula.

Actualy that’s just what I did and it doesn’t seem to be the case? (this is for 7/5)

julia> L
6×6 Array{Float64,2}:
 1.0       0.0       0.0  0.0  0.0  0.0
 0.0       1.0       0.0  0.0  0.0  0.0
 0.0       0.0       1.0  0.0  0.0  0.0
 0.0       0.0       0.0  1.0  0.0  0.0
 0.823648  0.0       0.0  0.0  1.0  0.0
 0.0       0.910357  0.0  0.0  0.0  1.0

julia> luL
7×7 Array{Float64,2}:
 1.0       0.0       0.0        0.0       0.0      0.0  0.0
 0.0       1.0       0.0        0.0       0.0      0.0  0.0
 0.220607  0.0       1.0        0.0       0.0      0.0  0.0
 0.0       0.209361  0.0        1.0       0.0      0.0  0.0
 0.0       0.0       0.0931903  0.0       1.0      0.0  0.0
 0.0       0.0       0.0        0.324195  0.0      1.0  0.0
 0.0       0.0       0.0        0.0       0.06044  0.0  1.0

julia> U
6×6 Array{Float64,2}:
 0.164566  0.0       0.0      0.0       0.361828   0.0      
 0.0       0.177329  0.0      0.0       0.0        0.973216 
 0.0       0.0       0.27888  0.0       0.0        0.0      
 0.0       0.0       0.0      0.203477  0.0        0.0      
 0.0       0.0       0.0      0.0       0.0423017  0.0      
 0.0       0.0       0.0      0.0       0.0        0.0682693

julia> luU
7×7 Array{Float64,2}:
 2.65545  0.0      0.353129  0.0       0.0       0.0       0.0      
 0.0      2.57589  0.0       0.767602  0.0       0.0       0.0      
 0.0      0.0      2.79038   0.0       0.043141  0.0       0.0      
 0.0      0.0      0.0       2.80709   0.0       0.267985  0.0      
 0.0      0.0      0.0       0.0       2.76367   0.0       0.0668464
 0.0      0.0      0.0       0.0       0.0       2.38242   0.0      
 0.0      0.0      0.0       0.0       0.0       0.0       2.05833

Edit: hmmm the dimensions are not quite right… could that be it?

lu(Matrix(A)) I guess

Also, there is a mistake in the definition of L which I corrected

It seems it works when the bandwith is even actually which is quite puzzling. Maybe a mistake with the indices

Ok I spent quite some time on this because I thought it was fun. Here’s an algo that actually works, note that the decomposition is actually a bit trickier.

It’s just a bit annoying with the indices floating around


function decomp(A, p)
    n = size(A, 1)

    λ = zeros(p)
    μ = zeros(p)
    v = zeros(n)

    for i=1:n-p
        v[i] = A[i,i]
    end

    for j = n-p+1:n
        μ[j-n+p] = A[j-n+p, j]
    end

    for k=1:Int(floor(n/(n-p)))
        for i=(k*(n-p)+1):min((k+1)*(n-p), n)
            λ[i-n+p] = A[i,i-n+p] / v[i-n+p]
            v[i] = A[i, i] - λ[i-n+p] * μ[i-n+p]
        end
    end

    v, λ, μ
end

buildL(λ, n) = diagm(-n+length(λ) => λ) + I

buildU(v, μ, n) = diagm(0=>v, n-length(μ)=>μ)


n, p = 7, 5

v = randn(n)
λ = randn(p)
μ = randn(p)

L = buildL(λ, n)
U = buildU(v, μ, n)

A = L*U

rv, rλ, rμ = decomp(A, p)

norm(rv - v)
norm(rλ - λ)
norm(rμ - μ)

Edit: to get to that, I wrote down the equation for rows of L and columns of U and computed what their dot product looks like in general. This gave me:

function rowL(i, λ, n)
    p = length(λ)
    ri = zeros(n)
    for j = 1:n
        ri[j] = (j==i ? 1.0 : 0.0) +
                   ((j == i-n+p) ? λ[j] : 0.0)
    end
    ri
end

function colU(j, v, μ)
    n  = length(v)
    p  = length(μ)
    cj = zeros(n)
    for i = 1:n
        cj[i] = (i==j ? v[i] : 0.0) +
                    ((i == j-n+p) ? μ[i] : 0.0)
    end
    cj
end

function LU(i, j, v, λ, μ)
    p = length(λ)
    n = length(v)

    ri = rowL(i, λ, n)
    cj = colU(j, v, μ)

    gt = dot(ri, cj)

    hand = (i == j ? v[i] + (i ≥ n-p+1 ? λ[i-n+p]*μ[i-n+p] : 0.0) : 0.0) +
           (j == i-n+p ? λ[i-n+p]*v[i-n+p] : 0.0) +
           (i == j-n+p ? μ[i] : 0.0)

#    @show  gt-hand
    return hand
end

The expressions here are not simplified to clarify how I got an expression for a[i, j]. Once I had the LU bit, I reversed the steps and realised that there was a trick where you can only get v and lambda in blocks (as per previous algo)

Edit2 what I mean by “blocks” is that the equations allow you to get things from n-p+1 to 2n-2p and then from 2n-2p+1 to 3n-3p etc until you bottom out. But past n-p+1 you cannot in general get all remaining v and lambda in one shot.

I think the Maple document you had doesn’t show this because p is too small but if you take the n=7 and p=5 case, you can see how some of this plays out.

Edit3: it may be that the decomposition I suggest can be further simplified, I didn’t really try, I just know that it’s correct as I tried a big bunch of n and p and the relative error is always around 1e-15.

Here’s your code adjusted


using LinearAlgebra, SparseArrays
function thomas2bands!(a::vec, b::vec, c::vec, rhs::vec) where {T, vec <: Vector{T}}
	@assert length(a) == length(c)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# we have L and U written 8 lines below, L = [l, 1] and U = [v, c]

	λ = zeros(p)
    μ = zeros(p)
    v = zeros(n)

    for i=1:n-p
        v[i] = b[i]
    end

    for k=1:Int(floor(n/(n-p)))
        for i=(k*(n-p)+1):min((k+1)*(n-p), n)
            λ[i-n+p] = a[i-n+p] / v[i-n+p]
            v[i] = b[i] - λ[i-n+p] * c[i-n+p]
        end
    end

	# up to here, this has been check, the LU decomposition seems OK
	@show n-p
	L = spdiagm(-n+p => λ, 0 => fill(1.0, n))
	U = spdiagm(0 => v, n-p => c)
	A = spdiagm(-n+p => a, 0 => b, n-p => c)
	println("-> A - LU = ", norm(L * U - A, Inf64))

	# we want to solve LUx = rhs
	# we first solve Ly = rhs
	y = similar(rhs)
	for ii=1:n-p
		y[ii] = rhs[ii]
	end
	for ii=n-p+1:n
		y[ii] = rhs[ii] - λ[ii-n+p] * y[ii-n+p]
	end
	println("-> L \\ rhs = ",norm(y -  (L \ rhs), Inf64))

	# we solve Ux = y
	x = similar(rhs)
	for ii = n-p+1:n
		x[ii] = y[ii] / v[ii]
	end
	for ii=1:n-p
		x[ii] = y[ii] / v[ii] - x[ii+n-p] * c[ii] / v[ii]
	end
	println("-> y - U * x   = ",norm((y -  (U * x)), Inf64))
	println("-> U \\ y   = ",norm(x -  (U \ y),Inf64))
	return x
end
	sol = @time thomas2bands!(rand(10^5-10^3), 2 .+ rand(10^5), rand(10^5-10^3), rand(10^5))

This gives

n - p = 1000
-> A - LU = 4.440892098500626e-16

(I didn’t check the rest)

Thank you a lot for your intense help!

I will digest this.

1 Like

Hey

Sorry for coming back late. The following work for me

function thomas2bands!(a, b, c, d; debug = false)
	@assert length(a) == length(c)
	@assert length(b) == length(d)
	debug && println("#"^20)
	n = length(b)
	p = length(a)
	# we first derive the LU decomposition
	# we have L and U written 8 lines below, L = [l, 1] and U = [v, c]
	v = similar(b)
	l = similar(a)
	for i=1:n-p
		v[i] = b[i]
	end

	for i=n-p+1:n
		l[i-n+p] = a[i-n+p] / v[i-n+p]
		v[i] = b[i] - l[i-n+p] * c[i-n+p]
	end
	# up to here, this has been check, the LU decomposition seems OK

	debug && (L = spdiagm(-n+p => l, 0 => fill(1.0, n)))
	debug && (U = spdiagm(0 => v, n-p => c))
	debug && (A = spdiagm(-n+p => a, 0 => b, n-p => c))
	debug && println("-> A - LU = ", norm(L * U - A, Inf64))
end