is there any news on this? It appears it is still not possible solving problems with sparse rhs. This is a problem when dealing with very large, and very sparse, matrices, where it is not possible allocate memory for a full rhs, and solving for each column of the rhs separately it would be too slow.

You mean sparse-matrix rhs?

(Moved to a new thread â please donât resurrect ancient threads.)

In general, even if you have a sparse rhs and a sparse matrix, the solution will be dense, so if you donât have enough memory to allocate the rhs then you will be in trouble with the solution, no?

However, if you have a sparse rhs (or set of right-hand-sides) and you only want a *sparse subset* of the solution vectors, then there is a nice technique recently identified by a colleague of mine (Lin, Wang, & Hsu (2022)) that may give enormous speedups. (They applied it to scattering problems, but itâs quite a general linear-algebra trick.)

Suppose you are solving AX=B for some matrix B of right-hand sides, but you are only interested in a sparse âprojectionâ Y = PX of the solutions involving some small subset of the outputs (rows P << cols P). Then, you can form the following âaugmentedâ sparse matrix K and only do a **partial** LU factorization to introduce zeros in the block corresponding to -P:

at which point the block Y (the Schur complement) is exactly the desired solution Y = PA^{-1}B. They find that this can often be vastly faster than a traditional sparse solve, mainly because it doesnât need to do a full LU factorization of A (the matrix U can be discarded, and does not even need to be upper triangular). It also fully exploits sparsity of B.

They write that MUMPS and PARDISO already support such partial factorizations, so you could presumably do this via the Julia MUMPS.jl or Pardiso.jl packages (possibly with a little digging to access the lower-level API). I havenât checked yet whether SuiteSparse supports this kind of partial factorization.

yes, the situation is the type

\mathbf{A} \mathbf{X} = \mathbf{Y}

with \mathbf{A} and \mathbf{Y} sparse matrices, where \mathbf{Y} has fewer columns than \mathbf{A} but still too many for storing the full matrix. But, as @stevengj, just pointed out in general \mathbf{X} will not be sparse even if \mathbf{A} and \mathbf{Y} are.

I have a large FE problem and I would like to solve it in chunks, but plain sectioning of the domain will not solve memory problems, on the contrary it seems it will make it worse.

thanks I will look into it, I realized just after posting the message that I was not going to be able to store the result anyway even if I could solve the linear system.

Apparently @miles.lubin used a similar trick in a 2014 paper (equations 3.11â3.13), where they also commented that itâs surprisingly little known in the context of direct solves with sparse inputs/outputs.

Wade Hsu tells me that SuiteSparse probably cannot do this efficiently. His group currently uses MUMPS, which (because it is multifrontal) is optimized for this sort of thingâin fact, both MUMPS and PARDISO can return the desired Schur complement Y without even *storing* the other factors U and F. See the `mumps_schur_complement`

function in MUMPS.jl.

Isnât this equivalent to non-backward-stable Gauss-Jordan?

No.

Well thatâs a short answer. I disagree, and I have sample code to show it:

```
using LinearAlgebra
using Random
Random.seed!(0)
n = 5
A = rand(n,n)
b = rand(n)
# Gauss-Jordan
U = copy(A)
w = copy(b)
for k = 1:n
ind = vcat(1:k-1, k+1:n)
mult = U[ind,k] / U[k,k]
U[ind,:] = U[ind,:] - mult * U[k,:]'
w[ind] = w[ind] - mult*w[k]
# diagonal scaling of the pivot row (this is usually done at the end, but it is equivalent to do it now)
U[k,:] = U[k,:] / U[k,k]
w[k,:] = w[k,:] / U[k,k]
display(U)
display(mult)
end
display(norm(A*x - b) / norm(b))
# OP's method with P=I (one can omit rows in the second half of the matrix to change P)
K = [A b; -I zeros(n)]
U = copy(K)
for k = 1:n
mult = U[k+1:end,k] / U[k,k]
U[k+1:end,:] = U[k+1:end,:] - mult * U[k, :]'
display(U[k+1:n+k, 1:end-1])
display(mult[1:n-1])
end
x2 = U[n+1:end, end]
display(norm(A*x2 - b) / norm(b))
```

One can see that the two methods compute the exact same multipliers and entries in the elimination matrix, just with the rows in a different order.

The stability of GaussâJordan elimination is studied in detail, for instance, in Highamâs *Accuracy and stability of numerical algorithms*, Section 14.4. The algorithm is forward stable, but not backward stable.

The reason for this instability, informally, is that, even after pivoting, you have no control over the magnitude of the multipliers that create the lower half of K during the elimination.

You *can* use (unstable) unpivoted elimination to compute the Schur complement, but you donât *have* to. You are free to use as many row permutations as you want *within the A block*.

(*A* is the matrix you would be solving with in the original form of the problem, so itâs exactly equivalent to solves using *A* alone. And of course, sparse-direct solvers do use pivoting.)

The point here, by the way, is not the Schur complements are a new idea â this is indeed first-semester linear-algebra stuff! â but rather that people donât seem to have appreciated that this is a powerful way to *exploit sparse outputs and right-hand sides* in multifrontal sparse direct solvers.

Sorry, I made a terrible job at explaining myself. That code was not meant to display a numerical problem; it was meant to display that the quantities you compute in that algorithm are the same as in GJ. I left off pivoting since it wasnât the main point.

My point is that this method is *not* equivalent to a system solve using LU factorization; you make the same LU steps, but after those you do something different than backward substitution to complete the algorithm, and this âsomething differentâ is forward but not backward stable. Of course you may decide that this is sufficient stability for your large-scale applications (especially since you donât compute all entries of the solution, so you canât even form the residual Ax-b to check if it is small), but it is an interesting remark.

You donât (need to) do the â upwardâ elimination steps you would do with GJ. Instead, you can just do (partial) LU with a few more rows.

BasicLU.jl supports sparse right-hand side for information.

```
m = 1000
F = basiclu_object(m)
A = sprand(m, m, 5e-3) + I
err = factorize(F, A)
b = sparsevec([1], [1.0], m)
x = solve(F, b, 'N') # sparse solution
```