Sparse solve with sparse rhs

Is there any news on this? It appears it is still not possible to solve problems with a sparse rhs. This is a problem when dealing with very large, very sparse matrices, where it is not possible to allocate memory for a full rhs, and solving for each column of the rhs separately would be too slow.


You mean sparse-matrix rhs?


(Moved to a new thread — please don’t resurrect ancient threads.)

In general, even if you have a sparse rhs and a sparse matrix, the solution will be dense, so if you don’t have enough memory to allocate a dense rhs then you will be in trouble storing the (dense) solution, no?
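
For instance (a tiny illustration of my own, not from the thread): even a sparse tridiagonal matrix with a single nonzero in the rhs produces a completely dense solution.

using SparseArrays

n = 1000
A = spdiagm(-1 => fill(-1.0, n-1), 0 => fill(2.0, n), 1 => fill(-1.0, n-1))  # sparse tridiagonal
b = sparsevec([1], [1.0], n)   # rhs with a single nonzero entry
x = A \ Vector(b)              # sparse-direct solve; the rhs is converted to dense here
@show count(!iszero, x)        # == n: every entry of the solution is nonzero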

However, if you have a sparse rhs (or set of right-hand-sides) and you only want a sparse subset of the solution vectors, then there is a nice technique recently identified by a colleague of mine (Lin, Wang, & Hsu (2022)) that may give enormous speedups. (They applied it to scattering problems, but it’s quite a general linear-algebra trick.)

Suppose you are solving AX=B for some matrix B of right-hand sides, but you are only interested in a sparse “projection” Y = PX of the solutions involving some small subset of the outputs (rows P << cols P). Then, you can form the following “augmented” sparse matrix K and only do a partial LU factorization to introduce zeros in the block corresponding to -P:

K = \begin{pmatrix} A & B \\ -P & 0\end{pmatrix} \rightsquigarrow \begin{pmatrix} U & F \\ 0 & Y\end{pmatrix}

at which point the block Y (the Schur complement) is exactly the desired solution Y = PA^{-1}B. They find that this can often be vastly faster than a traditional sparse solve, mainly because it doesn’t need to do a full LU factorization of A (the matrix U can be discarded, and does not even need to be upper triangular). It also fully exploits sparsity of B.

They write that MUMPS and PARDISO already support such partial factorizations, so you could presumably do this via the Julia MUMPS.jl or Pardiso.jl packages (possibly with a little digging to access the lower-level API). I haven’t checked yet whether SuiteSparse supports this kind of partial factorization.
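
As a toy, dense check of the algebra (my own sketch, with made-up sizes, a nontrivial P, and a random sparse A with a boosted diagonal so that unpivoted elimination is harmless): eliminating the first block column of K leaves Y = PA^{-1}B in the trailing block. This is of course not how a multifrontal solver would organize the computation; the loop below is only there to make the identity concrete at small scale.

using LinearAlgebra, SparseArrays

n, nrhs = 50, 3
keep = [7, 19, 41]                          # the few solution rows we actually want
A = sprand(n, n, 0.2) + 10.0I               # sparse test matrix, made comfortably nonsingular
B = sprand(n, nrhs, 0.2)                    # sparse right-hand sides
P = sparse(1:length(keep), keep, ones(length(keep)), length(keep), n)
K = [A B; -P spzeros(length(keep), nrhs)]   # augmented sparse matrix [A B; -P 0]

# eliminate the first n columns of K (dense and unpivoted here, purely as a check);
# a solver with a partial-factorization/Schur feature would return the trailing block directly
U = Matrix(K)
for k = 1:n
    rows = k+1:size(U, 1)
    U[rows, :] -= (U[rows, k] / U[k, k]) * U[k, :]'
end
Y = U[n+1:end, n+1:end]
@show norm(Y - P * (A \ Matrix(B))) / norm(Y)   # agrees with P*A^{-1}*B to roundoff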


Yes, the situation is of the type

\mathbf{A} \mathbf{X} = \mathbf{Y}

with \mathbf{A} and \mathbf{Y} sparse matrices, where \mathbf{Y} has fewer columns than \mathbf{A} but still too many to store as a full matrix. But, as @stevengj just pointed out, in general \mathbf{X} will not be sparse even if \mathbf{A} and \mathbf{Y} are.

I have a large FE problem that I would like to solve in chunks, but plain sectioning of the domain will not solve the memory problem; on the contrary, it seems it will make it worse.

Thanks, I will look into it. I realized just after posting the message that I was not going to be able to store the result anyway, even if I could solve the linear system.

Apparently @miles.lubin used a similar trick in a 2014 paper (equations 3.11–3.13), where they also commented that it’s surprisingly little known in the context of direct solves with sparse inputs/outputs.


Wade Hsu tells me that SuiteSparse probably cannot do this efficiently. His group currently uses MUMPS, which (because it is multifrontal) is optimized for this sort of thing—in fact, both MUMPS and PARDISO can return the desired Schur complement Y without even storing the other factors U and F. See the mumps_schur_complement function in MUMPS.jl.


Isn’t this equivalent to non-backward-stable Gauss-Jordan?

No.

Well that’s a short answer. 🙂 I disagree, and I have sample code to show it:

using LinearAlgebra
using Random
Random.seed!(0)

n = 5

A = rand(n,n)
b = rand(n)


# Gauss-Jordan

U = copy(A)
w = copy(b)
for k = 1:n
    ind = vcat(1:k-1, k+1:n)
    mult = U[ind,k] / U[k,k]
    U[ind,:] = U[ind,:] - mult * U[k,:]'
    w[ind] = w[ind] - mult*w[k]
    # diagonal scaling of the pivot row (this is usually done at the end, but it is equivalent to do it now)
    piv = U[k,k]
    U[k,:] = U[k,:] / piv
    w[k] = w[k] / piv
    display(U)
    display(mult)
end
x1 = w   # after the loop U ≈ I, so w holds the Gauss-Jordan solution

display(norm(A*x1 - b) / norm(b))

# OP's method with P=I (one can omit rows in the second half of the matrix to change P)

K = [A b; -I zeros(n)]
U = copy(K)
for k = 1:n
	mult = U[k+1:end,k] / U[k,k]
	U[k+1:end,:] = U[k+1:end,:] - mult * U[k, :]'
	display(U[k+1:n+k, 1:end-1])
	display(mult[1:n-1])
end
x2 = U[n+1:end, end]
display(norm(A*x2 - b) / norm(b))

One can see that the two methods compute the exact same multipliers and entries in the elimination matrix, just with the rows in a different order.

The stability of Gauss–Jordan elimination is studied in detail, for instance, in Higham’s Accuracy and stability of numerical algorithms, Section 14.4. The algorithm is forward stable, but not backward stable.
The reason for this instability, informally, is that, even after pivoting, you have no control over the magnitude of the multipliers that create the lower half of K during the elimination.

You can use (unstable) unpivoted elimination to compute the Schur complement, but you don’t have to. You are free to use as many row permutations as you want within the A block.

(A is the matrix you would be solving with in the original form of the problem, so it’s exactly equivalent to solves using A alone. And of course, sparse-direct solvers do use pivoting.)
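
For concreteness, here is a variant (my own sketch) of the toy augmented-matrix elimination from the code above, with the partial-pivoting row search confined to the A block and a nontrivial P; the trailing Schur-complement block is unaffected by the row swaps.

using LinearAlgebra

n = 5
A = rand(n, n); b = rand(n)
keep = [2, 4]                               # solution entries we want (this defines P)
P = Matrix{Float64}(I, n, n)[keep, :]
K = [A b; -P zeros(length(keep))]

for k = 1:n
    # pivot search confined to rows k..n, i.e. inside the A block only
    p = k - 1 + argmax(abs.(K[k:n, k]))
    K[[k, p], :] = K[[p, k], :]
    rows = k+1:size(K, 1)
    K[rows, :] -= (K[rows, k] / K[k, k]) * K[k, :]'
end

Y = K[n+1:end, end]                         # the Schur-complement block
display(norm(Y - (A \ b)[keep]) / norm(Y))  # still equals P*A^{-1}*b, to roundoff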

The point here, by the way, is not that Schur complements are a new idea — this is indeed first-semester linear-algebra stuff! — but rather that people don’t seem to have appreciated that this is a powerful way to exploit sparse outputs and right-hand sides in multifrontal sparse direct solvers.

Sorry, I did a terrible job of explaining myself. That code was not meant to display a numerical problem; it was meant to show that the quantities you compute in that algorithm are the same as in GJ. I left off pivoting since it wasn’t the main point.

My point is that this method is not equivalent to a system solve using LU factorization; you perform the same LU steps, but after those you do something different from backward substitution to complete the algorithm, and this ‘something different’ is forward but not backward stable. Of course you may decide that this is sufficient stability for your large-scale applications (especially since you don’t compute all entries of the solution, so you can’t even form the residual Ax-b to check if it is small), but it is an interesting remark.

You don’t (need to) do the “upward” elimination steps you would do with GJ. Instead, you can just do (partial) LU with a few more rows.

For information, BasicLU.jl supports a sparse right-hand side:

using BasicLU, LinearAlgebra, SparseArrays   # LinearAlgebra for I, SparseArrays for sprand/sparsevec

m = 1000
F = basiclu_object(m)
A = sprand(m, m, 5e-3) + I
err = factorize(F, A)
b = sparsevec([1], [1.0], m)
x = solve(F, b, 'N')  # sparse solution

Reference: L. Schork and J. Gondzio, Permuting Spiked Matrices to Triangular Form and its Application to the Forrest-Tomlin Update.