Solving A\b in parallel


#1

I have an access to a large shared-memory machine. It has memory in the order of TB, and has 64 CPU cores.

If I want to perform A\b for a square matrix A and column vector b on this shared-memory machine using multiple CPU cores, what is the workflow? A is stored in Julia’s sparse matrix format.


#2

Did A\b not work? What kind of matrix is A? How sparse is it? Is it symmetric positive definite? Non-symmetric?


#3

A\b works, but how do we make sure it runs on N CPU cores?


#4

Check htop. You’ll see that it’s already multithreaded.


#5

So you mean I don’t need to do something like julia -np N scriptname.jl to make A\b in scriptname.jl use N CPU cores?


#6

In fact, don’t do that. Adding processes sets the number of BLAS threads to 1, which is what you don’t want.


#7

You can also try out https://github.com/JuliaSparse/Pardiso.jl/. I’ve gotten significantly higher performance with it than with the solvers coming with julia on high core machines.


#8

To clarify what is only really implied here, and please correct me if I’m wrong: linear algebra ops are automatically parallel if you have BLAS set up right (run versioninfo() and look for libopenblas). BLAS operations use a different kind of parallel computing than Julia native parallel as described in the parallel doc


#9

Solving a sparse linear system will not use BLAS directly, although it is likely used by the sparse solver.


#10

@kristoffer.carlsson, I am testing Pardiso.jl, but the performance enhancement with the number of CPU cores is a bit disappointing. The same problem is solved in about 40 seconds with 2 cores, and about 30 seconds with 16 cores. Is this typical, or you think something is wrong?


#11

How big is the matrix?


#12

It is million by million.

Also, do you have an example with set_solver!(ps, ITERATIVE_SOLVER)? I am solving a system with slowly evolving matrix, so I would like to perform factorization only once by using ITERATIVE_SOLVER, but not sure where to call this. Don’t see a good example in the PARDISO documentation, either.