It was with great horror that I discovered how non-trivial it is to sort N elements across P clusters when no single cluster can hold all elements at once. Or maybe I'm just thick. Anyway, I only found two such open-source algorithms, written in not-so-friendly C++ and Charm++, and surprisingly few papers on the subject.
MPISort.jl is a Julia package offering `mpisort!` as an interface to such distributed sorting methods.
One algorithm is included at the moment: sampling with interpolated histograms, or SIHSort (pronounced "sigh sort", like anything MPI-related), optimised for minimum inter-rank communication and memory footprint. Features:
- Does not require the distributed data to fit into the memory of a single node. No IO either.
- Works for any comparison-based data, with additional optimisations for numeric elements.
- Optimised for minimum MPI communication; can use Julia threads on each shared-memory node.
- The node-local arrays may have different sizes; sorting will try to balance the number of elements held by each MPI rank.
- Works with any `AbstractVector`, including accelerators such as GPUs (this needs further testing). Julia type inference and optimisations do wonders.
- Implements the standard Julia `sort!` API, and naturally works for custom data, comparisons, orderings, etc.
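To give a feel for the interface, here is a minimal usage sketch under MPI.jl. The `alg` keyword and the `SIHSort(comm)` constructor shown here are my assumptions about the API based on the description above; please check the package README for the exact signatures:

```julia
# Launch with e.g.: mpiexecjl -n 4 julia sort_example.jl
using MPI
using MPISort
using Random

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# Each rank holds its own chunk of the data; sizes may differ per rank.
rng = Xoshiro(rank)
local_array = rand(rng, 1:1000, 100 + rank)

# Distributed in-place sort: afterwards each rank holds a contiguous,
# locally sorted slice of the globally sorted data, with element
# counts rebalanced across ranks.
sorted_local = mpisort!(local_array; alg=SIHSort(comm))

println("Rank $rank holds $(length(sorted_local)) sorted elements")

MPI.Finalize()
```

Since the package follows the standard `sort!` API, keyword arguments such as `by`, `lt`, or `rev` should compose with it the same way they do with `Base.sort!`.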
I really want to thank the MPI.jl authors for making MPI such a joy to use in Julia; here's to hopefully more great work in the Julia distributed ecosystem!
I am looking for any feedback on the library design, code, and algorithm before I register the package. All suggestions are highly appreciated.