I want to discuss a better parallel strategy for my current code.
I have a data matrix with 500,000 columns:
- step 1: each column is processed independently by function 1, which updates that column in place. So function 1 runs independently 500,000 times.
- step 2: run function 2 once on the whole updated matrix.
My current strategy is to use MPI.jl to parallelize step 1 across multiple machines: the columns of the data matrix are split into chunks, each process updates its own chunk, and the updated chunks are then gathered on one machine, which runs step 2 on the whole updated matrix.
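My current code looks roughly like the sketch below (`function1` and `function2` are hypothetical stand-ins for my real functions, and I assume the columns split evenly across ranks; run with something like `mpiexec -n 4 julia script.jl`):

```julia
using MPI, LinearAlgebra

# Hypothetical stand-ins for the real functions:
function1(col) = col ./ (norm(col) + 1.0)   # updates one column
function2(M)   = sum(M)                     # consumes the whole matrix

MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)

nrows, ncols = 100, 1_000          # illustrative; the real matrix has 500,000 columns
@assert ncols % nproc == 0         # assume an even split of columns across ranks
chunk = ncols ÷ nproc

# Each rank holds its own block of columns (generated locally here for illustration).
local_cols = rand(nrows, chunk)

# Step 1: every rank updates its own columns independently.
for j in 1:chunk
    local_cols[:, j] = function1(local_cols[:, j])
end

# Gather the updated chunks on rank 0; column-major storage means the
# flat receive buffer concatenates the chunks in rank order.
buf = MPI.Gather(local_cols, comm; root = 0)

# Step 2: only rank 0 assembles the full matrix and runs function 2 once.
if rank == 0
    result = function2(reshape(buf, nrows, ncols))
end

MPI.Finalize()
```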
Is MPI the best choice? I was wondering whether there is a better parallel strategy than MPI.jl across multiple machines. Function 1 is time-consuming because it involves many matrix multiplications.
Thanks for all suggestions!
Generally speaking, MPI is a solid choice for distributed computing. Whether it is the best choice and whether other parallel programming models would be more appropriate depends on your problem, the cluster you have access to, and, most importantly, what you mean by “best”. Are you trying to improve the performance? Or usability?
If you want to parallelize across different machines, you can essentially choose between MPI.jl and the Distributed standard library. The former is established and known to scale well, while the latter might be easier to use and allows for interactive workflows.
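For example, a Distributed-based version of your step 1 could look like the sketch below (the column update is a hypothetical stand-in for your actual function 1; `addprocs` starts local workers here, but you can start workers on other machines, e.g. via a machine file):

```julia
using Distributed
addprocs(2)                        # local workers; use remote workers for multi-machine runs

@everywhere using LinearAlgebra

# Hypothetical stand-in for the real function 1: updates one column.
@everywhere update_column(col) = col ./ (norm(col) + 1.0)

nrows, ncols = 10, 1_000           # illustrative; the real matrix has 500,000 columns
M = rand(nrows, ncols)

# Materialize columns so each task ships only one column, not the whole matrix.
cols = [M[:, j] for j in 1:ncols]

# Step 1: pmap distributes the column updates across workers.
# Batching amortizes communication when a single call is cheap.
updated = pmap(update_column, cols; batch_size = 100)
M2 = reduce(hcat, updated)

# Step 2: function 2 runs once on the master with the full updated matrix.
result = sum(M2)                   # stand-in for function 2
```

Note that this naive version pays for moving every column over the network twice; whether that matters depends on how expensive function 1 is relative to the communication.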
Thanks for your reply! I really agree with the distinction you draw between "performance" and "usability". Do you have any suggestions on those two fronts? Do you mean that performance could be further improved by using a GPU, and usability by switching to the Distributed standard library?