How to parellalizing many matrix solves

Pulpo · November 23, 2020, 12:29am

I am trying to solve many matrix equations Ax=b of fixed size. A simplified version of the code is as follows:

function ser_solve(A,b,res,nu)
    for i in(1:nu)
        res[i,:]=A[i,:,:]\b[i,:]
    end
    return res;
end

nu = 10000; si = 100;
A = randn(nu,si,si); b = randn(nu,si);
res = zeros(nu,si);

ser_solve(A,b,res,nu);

The above code is small for now, but that’s just to demonstrate the problem. The sizes of the problems will grow. How would I go about using distributed computing for this code? I have attempted to use @threads in the for loop, but the speedup is nearly non-existent. My attempts at using @distributed and shared arrays haven’t seemed to work. How would I do this the correct way? Is there possibly a way to use pmap to do this?

I should also mention that the matrices in the code I am running are actually populated by some auxiliary functions. So what would be the best way to both parallelize that population of the matrices and then solve them?

Oscar_Smith · November 23, 2020, 5:42am

Two things to note here. res[i,:]=A[i,:,:]\b[i,:] should be written as res[i,:]=@views A[i,:,:]\b[i,:] to prevent copies. Also BLAS automatically uses threading for solves, so you won’t get a ton of speedup. That said, setting BLAS threads to 1 and threading the loop should lead to speedup.

baggepinnen · November 23, 2020, 5:49am

I’m not sure that would be beneficial in this case as the memory is not contiguous. If the memory layout was changed to respect the column major storage of Julia, the views would make sense for sure.

Oscar_Smith · November 23, 2020, 6:06am

Confirming that switching the index order and adding @views gives about 2x speedup.

Pulpo · November 23, 2020, 10:42pm

Hey thanks for you response! I have actually tried setting the BLAS thread to one and with @threads it does seem to give some speedup. The issue is that eventually the size and amount of the matrix solves will grow. Meaning that both si and nu will increase. So it seems to me that I will have to distribute the computation over somehow, possibly on a cluster. I just don’t know how exactly to do that.

Pulpo · November 23, 2020, 10:44pm

These are fair points I will incorporate into the code. Thanks for that. But out of curiosity, how would you go about distributing this problem over a cluster? This is what I am looking to do.

Topic		Replies	Views
Parallelizing a for-loop with a matrix Performance	2	343	June 2, 2021
Solving A\b in parallel Julia at Scale parallel , linearalgebra	13	3882	August 21, 2021
Distributed Arrays New to Julia question	3	422	July 21, 2020
Parallell multiplication matrix and inverse iteratively julia9 New to Julia	10	780	June 3, 2021
How to distribute the solving of a linear system on multiple Procs? General Usage question , linearalgebra , distributions	0	261	June 7, 2022

How to parellalizing many matrix solves

Related topics