Poll: What do you use: Distributed or MPI?

Julia’s default multi-processing system, Distributed, is based on a different paradigm than MPI, the de facto standard in HPC.

I’m curious how many people use each of these two types of distributed computation. If you use both of them frequently, please select both.

I’d also appreciate it if people could share their experience with these two different architectures.

  • Distributed
  • MPI
  • Others

Distributed has a big disadvantage: slow communication between processes, which MPI implementations typically optimize heavily.
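For anyone who wants to measure this on their own machine, here is a minimal sketch using only stock Distributed (numbers will vary with machine, transport, and Julia version; the MPI counterpart would be a standard ping-pong with `MPI.Send`/`MPI.Recv`):

```julia
using Distributed
addprocs(1)                        # one local worker, which gets id 2

payload = rand(UInt8, 1_000_000)   # ~1 MB message

# Warm up once to exclude compilation, then time a round trip through worker 2.
remotecall_fetch(identity, 2, payload)
@time remotecall_fetch(identity, 2, payload)
```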


Does DistributedNext improve this?


I am curious how slow the communication is. Do you have some numbers?


I use MPI because

  1. CUDA works with it (but I have to be careful with compiling PTX kernels ahead of time)
  2. HDF5 works with it, which lets me do very large-scale reads and writes very conveniently (see the sketch after this list).
  3. PencilFFTs.jl and PencilArrays.jl are really excellent packages for MPI
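
Regarding point 2, here is a minimal sketch of what the MPI + HDF5 workflow looks like (it assumes HDF5.jl is built against an MPI-enabled libhdf5, per the HDF5.jl parallel docs; the file name and sizes are made up for illustration):

```julia
using MPI, HDF5

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nprocs = MPI.Comm_size(comm)

M = 10
A = fill(rank, M)   # each rank's local column of data

# Open one file collectively across all ranks.
ff = h5open("output.h5", "w", comm, MPI.Info())

# Create a global (M x nprocs) dataset; each rank writes only its own column.
dset = create_dataset(ff, "/data", datatype(eltype(A)), dataspace((M, nprocs)))
dset[:, rank + 1] = A

close(ff)
MPI.Finalize()
```

Launched with something like `mpiexecjl -n 4 julia script.jl`, every rank writes its slice into the same file, which is what makes very large-scale I/O convenient.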

I’m afraid I don’t have any up-to-date numbers to support my claim. I did some testing two or three years ago, and it was no contest: MPI won handily. However, YMMV. It depends on the granularity of the computation. Little communication and lots of computation may well favor Distributed.

I am not familiar with DistributedNext. I’ll be curious to see what’s there.


I use both when developing Dagger.jl, because users might have a reason for using one or the other. Some users want the dynamism and flexibility of Distributed, while others want the raw performance and scalability of MPI. Some users want both, use MPIClusterManagers.jl, and are thus running both at the same time!

For reference, in Dagger we provide high-level abstractions (like our DArray, and our Datadeps parallel algorithm framework) that don’t mandate one or the other, but let users tell Dagger which to use, while providing equivalent semantics regardless of which option is chosen. This means the question “does library XYZ support Distributed or MPI?” doesn’t really matter when the library supports Dagger; the difference is handled behind the scenes, and no code has to be rewritten.

(Also, we allow selecting between Distributed and DistributedNext, so regardless of which Distributed-compatible library users choose, everything “just works”)
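
To illustrate the DArray side, here is a tiny sketch based on the documented Dagger API (illustrative, not exhaustive; the sizes are made up):

```julia
using Dagger

# A 512x512 distributed array, partitioned into 128x128 blocks. Where the
# blocks actually live (Distributed workers, MPI ranks, ...) is Dagger's
# concern, not the user's.
DA = rand(Blocks(128, 128), 512, 512)

s = sum(DA)       # reductions run on whatever backend is configured
M = collect(DA)   # materialize the full array locally when needed
```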


Cool. Is Dagger easier to use than MPI?

I’d definitely say so! MPI makes you deal with details like parallel synchronization and coordination, data transfer/serialization, managing concurrency, etc. - all complicated things that even HPC experts struggle with.

Dagger (Datadeps specifically) takes the approach of letting you express your program as a serial sequence of operations on a partitioned array, which Dagger then parallelizes for you based on small data-dependency annotations. Dagger handles those previously mentioned complexities for you: it uses MPI efficiently for data transfers, schedules your program across ranks automatically, and manages concurrency and program ordering. You still get the performance you’d expect from a well-written MPI program, without the mental overhead of writing the MPI calls (and everything else) yourself.
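
To make that concrete, here is a small sketch in the documented Datadeps style (the arrays and operations are made up for illustration):

```julia
using Dagger

A = rand(1024)
B = zeros(1024)
C = zeros(1024)

Dagger.spawn_datadeps() do
    # Written as a serial sequence; the In/Out annotations tell Dagger which
    # tasks read or write which data, and it parallelizes what it safely can.
    Dagger.@spawn copyto!(Out(B), In(A))          # B = A
    Dagger.@spawn map!(x -> 2x, Out(C), In(B))    # C = 2 .* B; ordered after the copy
end
```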