Julia’s default multi-processing system, Distributed, is based on a different paradigm than MPI, the de facto standard in HPC.
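To make the contrast concrete, here is a rough sketch of the two styles (illustrative only; worker counts, message counts, and launch details are arbitrary). With Distributed, a controller process adds workers at runtime and drives them; with MPI.jl, every rank runs the same script under an external launcher.

```julia
# Distributed: a controller process adds workers at runtime and farms work out to them.
using Distributed
addprocs(4)                      # spawn 4 local worker processes
@everywhere f(x) = x^2           # define f on every worker
results = pmap(f, 1:100)         # the controller distributes the work dynamically
println(sum(results))
```

```julia
# MPI.jl: every rank runs this same script, launched externally,
# e.g. `mpiexec -n 4 julia script.jl`.
using MPI
MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)
local_sum = sum(i^2 for i in (rank + 1):nproc:100)   # each rank takes a strided share
total = MPI.Reduce(local_sum, +, comm; root=0)       # combine the partial sums on rank 0
rank == 0 && println(total)
```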
I’m curious how many people use each of these two approaches to distributed computation. If you use both of them frequently, please select both.
I’d also appreciate it if people could share their experience with these two different architectures.
4 Likes
Distributed has a big disadvantage: slow communication between processes. MPI implementations typically optimize that a lot.
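If you want numbers for your own setup, a crude round-trip comparison looks roughly like this (a sketch only: message size, warm-up, and the exact MPI.jl call details are assumptions to check against the docs):

```julia
# Crude round-trip latency between the controller and worker 2 via Distributed.
using Distributed
addprocs(1)
payload = zeros(UInt8, 1024)                 # 1 KiB message
remotecall_fetch(identity, 2, payload)       # warm up (compilation, connection)
t = @elapsed for _ in 1:1000
    remotecall_fetch(identity, 2, payload)
end
println("Distributed round trip: ", t / 1000 * 1e6, " µs")
```

```julia
# The MPI ping-pong to compare against; run with `mpiexec -n 2 julia pingpong.jl`.
using MPI
MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
buf  = zeros(UInt8, 1024)
MPI.Barrier(comm)
t = MPI.Wtime()
for _ in 1:1000
    if rank == 0
        MPI.Send(buf, comm; dest=1)
        MPI.Recv!(buf, comm; source=1)
    else
        MPI.Recv!(buf, comm; source=0)
        MPI.Send(buf, comm; dest=0)
    end
end
t = MPI.Wtime() - t
rank == 0 && println("MPI round trip: ", t / 1000 * 1e6, " µs")
```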
2 Likes
Does DistributedNext improve this?
1 Like
I am curious how slow the communication is. Do you have some numbers?
1 Like
I’m afraid I don’t have any up-to-date numbers to support my claim. I did some testing two or three years ago, and it was no contest: MPI won handily. However, YMMV. It depends on the granularity of the computation. Little communication and lots of computation may well favor Distributed.
I am not familiar with DistributedNext. I’ll be curious to see what’s there.
1 Like
I use both when developing Dagger.jl, because users may have reasons for preferring one or the other. Some users want the dynamism and flexibility of Distributed, while others want the raw performance and scalability of MPI. Some users want both, and use MPIClusterManagers.jl, and are thus using both at the same time!
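For anyone who hasn’t seen that combination: MPIClusterManagers.jl brings up Distributed workers under an MPI launcher, so the two stacks coexist in one job. Very roughly, and treating the constructor name and arguments as assumptions to check against the package README:

```julia
# Sketch: Distributed workers that are also MPI ranks, via MPIClusterManagers.jl.
# The MPIWorkerManager constructor and its arguments are from memory; check the
# package README for the exact API on your version.
using Distributed, MPIClusterManagers

manager = MPIWorkerManager(4)    # request 4 workers launched under mpiexec
addprocs(manager)                # they join as ordinary Distributed workers

@everywhere println("hello from worker ", myid())
```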
For reference, in Dagger we provide high-level abstractions (like our DArray, and our Datadeps parallel algorithm framework) that don’t specify one or the other, but allow users to tell Dagger which to use, while providing equivalent semantics regardless of which option is chosen. This means the question “Does library XYZ support Distributed or MPI?” doesn’t really matter when the library supports Dagger; the difference is handled behind the scenes, and no code has to be rewritten.
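For a flavor of what that looks like from the user’s side, here is a rough sketch against the DArray API (block sizes are arbitrary, and exact method coverage may differ across Dagger versions):

```julia
using Dagger

# A 256x256 matrix partitioned into 64x64 blocks. Where the blocks live
# (threads, Distributed workers, or MPI ranks) is Dagger's decision, not this code's.
A = rand(Blocks(64, 64), 256, 256)
B = map(x -> 2x, A)          # elementwise operations stay partitioned
println(fetch(sum(B)))       # reductions behave like on a plain Array
```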
(Also, we allow selecting between Distributed and DistributedNext, so regardless of which Distributed-compatible library users choose, everything “just works”)
7 Likes
Cool. Is Dagger easier to use than MPI?
I’d definitely say so! MPI makes you deal with details like parallel synchronization and coordination, data transfer/serialization, managing concurrency, etc. - all complicated things that even HPC experts struggle with.
Dagger (Datadeps specifically) takes the path of letting you express your program as a serial sequence of operations on a partitioned array, which Dagger then parallelizes for you based on small data-dependency annotations. Dagger handles the complexities mentioned above for you: it uses MPI efficiently for data transfers, schedules your program across ranks, and manages concurrency and ordering, so you still get the performance you expect from a well-written MPI program without the mental overhead of writing the MPI calls (and everything else) yourself.
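To make that concrete, here is a simplified sketch of the Datadeps style. The `spawn_datadeps`/`In`/`Out`/`InOut` names follow the Dagger docs, but treat the details as assumptions; a real program would operate on the blocks of a partitioned array rather than three whole matrices:

```julia
using Dagger, LinearAlgebra

# Plain in-memory blocks for illustration; in practice these would be
# pieces of a partitioned (e.g. DArray) matrix.
A = rand(512, 512)
B = rand(512, 512)
C = zeros(512, 512)

# Written as straight-line serial code. The annotations declare what each task
# reads and writes; Dagger derives the parallel schedule (and any data movement)
# from them.
Dagger.spawn_datadeps() do
    Dagger.@spawn mul!(Dagger.Out(C), Dagger.In(A), Dagger.In(B))  # C = A * B
    Dagger.@spawn lmul!(2.0, Dagger.InOut(C))                      # C *= 2, waits on the mul!
end
```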