I want to discuss a better parallel strategy for my current code.
I have a data matrix with 500,000 columns:
- step 1: each column is processed independently by function 1, which updates that column in place. So function 1 runs independently 500,000 times.
- step 2: run function 2 once on the whole updated matrix.
My current strategy is to use MPI.jl to parallelize step 1 across multiple machines: the columns of the data matrix are split into chunks, each process updates its own chunk, and the updated chunks are then gathered on one machine, which runs step 2 on the whole updated matrix.
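My current code looks roughly like the sketch below (`function1` and `function2` are hypothetical stand-ins for my real functions, and I assume the columns split evenly across ranks; run with something like `mpiexec -n 4 julia script.jl`):

```julia
using MPI, LinearAlgebra

# Hypothetical stand-ins for the real functions:
function1(col) = col ./ (norm(col) + 1.0)   # updates one column
function2(M)   = sum(M)                     # consumes the whole matrix

MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)

nrows, ncols = 100, 1_000          # illustrative; the real matrix has 500,000 columns
@assert ncols % nproc == 0         # assume an even split of columns across ranks
chunk = ncols ÷ nproc

# Each rank holds its own block of columns (generated locally here for illustration).
local_cols = rand(nrows, chunk)

# Step 1: every rank updates its own columns independently.
for j in 1:chunk
    local_cols[:, j] = function1(local_cols[:, j])
end

# Gather the updated chunks on rank 0; column-major storage means the
# flat receive buffer concatenates the chunks in rank order.
buf = MPI.Gather(local_cols, comm; root = 0)

# Step 2: only rank 0 assembles the full matrix and runs function 2 once.
if rank == 0
    result = function2(reshape(buf, nrows, ncols))
end

MPI.Finalize()
```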
Is MPI the best choice? I was wondering whether there is a better parallel strategy than MPI.jl across multiple machines. Function 1 is time-consuming because it involves many matrix multiplications.
Thanks for all suggestions!
Generally speaking, MPI is a solid choice for distributed computing. Whether it is the best choice and whether other parallel programming models would be more appropriate depends on your problem, the cluster you have access to, and, most importantly, what you mean by “best”. Are you trying to improve the performance? Or usability?
If you want to parallelize across different machines, you can essentially choose between MPI.jl and the Distributed standard library. The former is established and known to scale well, while the latter might be easier to use and allows for interactive workflows.
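For example, a Distributed-based version of your step 1 could look like the sketch below (the column update is a hypothetical stand-in for your actual function 1; `addprocs` starts local workers here, but you can start workers on other machines, e.g. via a machine file):

```julia
using Distributed
addprocs(2)                        # local workers; use remote workers for multi-machine runs

@everywhere using LinearAlgebra

# Hypothetical stand-in for the real function 1: updates one column.
@everywhere update_column(col) = col ./ (norm(col) + 1.0)

nrows, ncols = 10, 1_000           # illustrative; the real matrix has 500,000 columns
M = rand(nrows, ncols)

# Materialize columns so each task ships only one column, not the whole matrix.
cols = [M[:, j] for j in 1:ncols]

# Step 1: pmap distributes the column updates across workers.
# Batching amortizes communication when a single call is cheap.
updated = pmap(update_column, cols; batch_size = 100)
M2 = reduce(hcat, updated)

# Step 2: function 2 runs once on the master with the full updated matrix.
result = sum(M2)                   # stand-in for function 2
```

Note that this naive version pays for moving every column over the network twice; whether that matters depends on how expensive function 1 is relative to the communication.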
Thanks for your reply! I really agree with the distinction you draw between "performance" and "usability". Do you have any suggestions on those two fronts? Do you mean that performance could be further improved by using a GPU, and usability by switching to the Distributed standard library?