How to solve this Ax=b faster?

Minimum-Pollution-96 · May 26, 2022, 5:34am

I am trying to solve a sparse stiffness matrix (positive definite) of size 1986360x1986360 SparseMatrixCSC{Float64, Int64} with 18327153 stored entries.
Here’s a sample code.

using SparseArrays
using LinearAlgebra

A = sprandn(1986360, 1986360, 2e-6).*20;
A = A+A'+100I

b = zeros(size(a,1));
b[[1:331060;].*6] .= 200;

x = A\b

The problem is that the last step takes so much time as it uses only one core (have never actually finished it). I tried running it via julia REPL normally as well as by launching julia with julia --threads=16.

I also tried solving it as x=cholesky(A)\b but that also uses only one core. It also throws SuiteSparse.CHOLMOD.CHOLMODException("out of memory") after sometime even though 23 gb of memory(out of 32) is available.

Is there anyway to make julia use all the cores? Also, I am trying solve such a large matrix for the first time. So, please do suggest if there is something I can do to solve this faster.

Jae-Mo_Lihm · May 26, 2022, 8:19am

Have you tried iterative solvers?

There is Krylov.jl and many others.

https://juliasmoothoptimizers.github.io/Krylov.jl/stable/solvers/spd/

greatpet · May 26, 2022, 8:29am

Random sparse systems should be solved by iterative solvers, since the default direct solver causes fill-in during Gaussian elimination and blows up the memory usage.

DNF · May 26, 2022, 8:35am

Side note: Don’t do this. It allocates an array - no! two arrays - needlessly. It is of course not an effect of any significance here, but it’s a good idea to mind unnecessary allocations in general.

Instead, do (1:331060) .* 6 or 6:6:(331060*6).

Minimum-Pollution-96 · May 26, 2022, 9:01am

Thank you! I’ll check this out.

Minimum-Pollution-96 · May 26, 2022, 9:01am

Thanks. Noted.

Minimum-Pollution-96 · May 26, 2022, 9:05am

Could you explain this a little more? I thought that sparse systems have separate solvers that act on them. Sparse Linear Algebra · The Julia Language
But I do face out of memory error while trying cholesky(A).

greatpet · May 26, 2022, 9:09am

Sparse matrices encountered in many real-world problems have some structures, which makes it possible to do Gaussian elimination without turning the matrix into a dense matrix in intermediate steps (called “fill-in”). If your sparse matrix is completely random, then Gaussian elimination doesn’t work well, and you should switch to a package that implements iterative methods which are insensitive to the fill-in problem.

Andrea_Vigliotti · May 26, 2022, 9:57am

you might want to look at this reply by @stevengj, I haven’t tried it yet myself, but he mentions two parallel solvers MUMPS.jl and Pardiso.jl that might be of help

ChrisRackauckas · May 26, 2022, 12:32pm

Just use LinearSolve.jl

http://linearsolve.sciml.ai/dev/

And then run the gambit.

for alg in [KrylovJL_GMRES(), KLUFactorization(), KrylovKitJL_GMRES(), IterativeSolversJL_GMRES(), UMFPACKFactorization(), MKLPardisoFactorize(), MKLPardisoIterate()]
  sol = solve(prob,alg)
end

Likely MKLPardiso is best without a preconditioner. For the GMRES ones, you may want to define a preconditioner like:

http://linearsolve.sciml.ai/dev/basics/Preconditioners/

Minimum-Pollution-96 · May 27, 2022, 10:17am

It seems like Pardiso is not free for academic users anymore. Lets me sign up but received a mail saying academic license discontinued from Jan 2022.

ChrisRackauckas:

Just use LinearSolve.jl

http://linearsolve.sciml.ai/dev/

And then run the gambit.
for alg in [KrylovJL_GMRES(), KLUFactorization(), KrylovKitJL_GMRES(), IterativeSolversJL_GMRES(), UMFPACKFactorization(), MKLPardisoFactorize(), MKLPardisoIterate()]
  sol = solve(prob,alg)
end
Likely MKLPardiso is best without a preconditioner. For the GMRES ones, you may want to define a preconditioner like:

http://linearsolve.sciml.ai/dev/basics/Preconditioners/

Thank you. I was able to solve it with Krylov. But I’ll check this out as well!

photor · May 27, 2022, 10:57am

Learned an important point from you and @Jae-Mo_Lihm

josuagrw · May 27, 2022, 11:18am

If this problem is occuring as part of an optimization problem, I can recommend COSMO.jl

Andrea_Vigliotti · May 28, 2022, 7:52am

I tried LinearSolve on a 3D Finite Element Problem with 68’170 elements and 82’424 nodes of the type

\mathbf{A}\mathbf{u} = \mathbf{r}

where \mathbf{A} is a sparse SPD matrix with 162’372 rows with 10’889’190 non zero elements (on average 67 non zero entries per row). I tried the following

using using LinearSolve
prob = LinearProblem(A,r);

for alg in [KrylovJL_GMRES(), KLUFactorization(), KrylovKitJL_GMRES(), IterativeSolversJL_GMRES(), UMFPACKFactorization()]
  @time sol = LinearSolve.solve(prob,alg)
end

180.594400 seconds (2.56 M allocations: 2.356 GiB, 3.02% gc time, 0.45% compilation time)
217.641759 seconds (4.28 M allocations: 3.479 GiB, 0.43% gc time)
181.348235 seconds (244.91 k allocations: 2.233 GiB, 3.01% gc time, 0.05% compilation time)
954.111518 seconds (4.03 M allocations: 450.389 MiB, 0.05% gc time, 0.17% compilation time)
 13.303307 seconds (121.40 k allocations: 2.719 GiB, 5.24% gc time, 0.96% compilation time)

the MKL solvers gave and UndefVarError so I couldnt try them.
I also tried Krylov.jl, and got

using Krylov
@time x, stats = Krylov.cg(A,r)

 80.857184 seconds (1.28 M allocations: 77.709 MiB, 0.88% compilation time)

if now I compare these performances with the standard LinearAlgebra.jl package, this is what I get

@time A\r;

  3.062161 seconds (661 allocations: 1.390 GiB, 1.07% gc time, 0.13% compilation time)

it seems that the LinearAlgebra solver, on this problem, outperforms all the others, should that be the case or am I not getting the best speed from the other solvers for some reason? or is it the size of the problem that influences the performances.

(please, do repost this as a new thread if think it is appropriate)

ChrisRackauckas · May 28, 2022, 11:42am

And it’s not just compile time? The LinearAlgebra default is just UMFPACKFactorization so it should be the same time. Do you have a good way for me to reproduce this?

For the Krylov methods, did you try adding any preconditioners? Algebraic multigrid is probably a good idea here.

Andrea_Vigliotti · May 28, 2022, 4:07pm

The matrix \mathbf{A} is the result of a FE mesh with hexahedral elements of a solid mechanics problem, with 3 degrees of freedom (DoFs) per node. I have Dirichlet boundary conditions on the boundary and free internal nodes, thus the problem is constructed as follows

\mathbf{A} = \mathbf{Kt}[\textrm{iDoFs},\textrm{iDoFs}]\\ \mathbf{r} = \mathbf{Kt}[\textrm{iDoFs},\textrm{bDoFs}]\mathbf{s}[\textrm{bDoFs}]

where \mathbf{Kt} is the stiffness matrix of the entire domain, \textrm{bDoFs} are the indices of the boundary DoFs and \textrm{iDoFs} are the indices of the internal DoFs, \mathbf{s} is the entire DoFs vector. \mathbf{A} is Hermitian, with size(A)=(162372, 162372)

Depending on the position of the element, whether internal or on the boundary, each node can be surrounded from 1 up to 8 elements, which decides the number of of non zero entries per each row of the matrix, all diagonal entries are non zero. It is a quite structured matrix, which is difficult to obtain by random generation, if it helps can share the matrix and the vectors as JLD2 files.

This is what I tried

u = @btime  A\r
@show  norm(r-A*u);

  3.843 s (52 allocations: 1.39 GiB)
norm(r - A * u) = 5.766109468498555e-12

u = @btime lu(A)\r
@show norm(r-A*u);

  10.340 s (72 allocations: 2.39 GiB)
norm(r - A * u) = 4.227052608619862e-12

then with LinearSolve.jl

prob = LinearProblem(A,r);

sol = @btime LinearSolve.solve(prob, UMFPACKFactorization())
@show norm(r-A*sol.u);

  10.984 s (94 allocations: 2.71 GiB)
norm(r - A * sol.u) = 4.227052608619862e-12

then using precodintioners from AlgebraicMultigrid.jl with Krylov methods I get

m   = AlgebraicMultigrid.ruge_stuben(A)
p   = AlgebraicMultigrid.aspreconditioner(m)
sol = @btime LinearSolve.solve(prob, LinearSolve.KrylovKitJL_GMRES(), Pl=p)
@show norm(r-A*sol.u);

 51.581 s (603 allocations: 471.36 MiB)
norm(r - A * sol.u) = 7.902120848944424e-5

and

m   = AlgebraicMultigrid.smoothed_aggregation(A)
p   = AlgebraicMultigrid.aspreconditioner(m)
sol = @btime LinearSolve.solve(prob, LinearSolve.KrylovKitJL_GMRES(), Pl=p)
@show norm(r-A*sol.u);

 46.302 s (613 allocations: 477.56 MiB)
norm(r - A * sol.u) = 0.0001005812834686952

all of these have run on Jupyter-lab notebook with a single thread. Different runs of @btime differ of up to 1 sec.
Please do let me know if you have any question or need more information.

side note: lu and UMFPACKFactorization() terminate on the same solution:

sol = LinearSolve.solve(prob, UMFPACKFactorization())
u   = lu(A)\r
@show norm(u-sol.u);

norm(u - sol.u) = 0.0

ChrisRackauckas · May 28, 2022, 4:39pm

What LinearSolve.jl is doing is equivalent to the lu(A)\b route, which will default to UMFPACK in Base Julia, but I did not know that there was a specialization of \ that was faster. Interesting, I wonder what shortcut it takes. @rayegun could you take a look at this?

Andrea_Vigliotti · May 28, 2022, 5:15pm

this is what I get from @which A\r :

~/julia/julia-1.6.4/share/julia/stdlib/v1.6/SparseArrays/src/linalg.jl:1551

rayegun · May 28, 2022, 6:43pm

I’ll take a look

rayegun · May 28, 2022, 6:48pm

Unfortunately that’s the dispatch method, which doesn’t give much info, you said A is Hermitian?

Can you send me the JLD2 file? If you need to send it privately you can transmit it to kimmerer@mit.edu somehow.

It looks like it’s calling CHOLMOD, by dispatching on ishermitian. Should we duplicate this codepath @ChrisRackauckas? I’m not even sure if we wrap CHOLMOD very well in LinearSolve.jl yet. julia-1.7.2+0~x64/share/julia/stdlib/v1.7/SuiteSparse/src/cholmod.jl:1573. I am working on familiarizing myself with the CHOLMOD codebase anyway right now.

Topic		Replies	Views
What is a recommended way to solve a linear equation? New to Julia question , linearalgebra , linearsolve	31	1325	May 8, 2025
From toy to production code with sparse matrix and sparse-direct factorizations Performance sparse , cholmod	17	1162	November 1, 2022
Solving Sparse Linear Systems fast Performance sparse , linearsolve	11	5050	June 23, 2022
Linear solver \(A, B) performance vs Matlab A\b General Usage	32	7670	May 21, 2017
Sparse Cholesky of Gram Matrix (SuiteSparse?) Numerics question , linearalgebra	25	2600	April 11, 2020

How to solve this Ax=b faster?

Related topics