ANN: MPI.jl v0.10.0: new build process and CUDA-aware support

I have just tagged a new version of MPI.jl. The user-facing interface is largely the same, but there has been extensive work under the hood to use the C API internally (instead of the Fortran one). As a result, the build process is much simpler: it no longer requires CMake or a Fortran compiler. It now also directly supports CUDA-aware MPI libraries, allowing CuArrays to be passed directly as buffers (thanks to Seyoon Ko).
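
For illustration, here is a minimal sketch of what the CUDA-aware path looks like, assuming the underlying MPI library was built with CUDA support and using the v0.10-style positional `Send`/`Recv!` signatures; the names and sizes are made up:

```julia
# Minimal sketch: CuArrays passed directly as MPI buffers
# (requires an MPI build with CUDA-aware support).
using MPI, CuArrays

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

if rank == 0
    send_buf = CuArray(Float64.(1:10))   # data stays on the GPU
    MPI.Send(send_buf, 1, 0, comm)       # (buffer, dest, tag, comm)
elseif rank == 1
    recv_buf = CuArrays.zeros(Float64, 10)
    MPI.Recv!(recv_buf, 0, 0, comm)      # (buffer, src, tag, comm)
    @show recv_buf
end

MPI.Finalize()
```

Run with something like `mpiexec -n 2 julia cuda_sendrecv.jl`.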

I would greatly appreciate it if people could try it out, especially on different clusters and with different MPI implementations.


Is it worth sharing this on the OpenMPI mailing list? I would guess you work closely with those guys anyway.

I just noticed this is in Julia at Scale!
Picture me doing a happy dance. Or maybe my Miata doing doughnuts (see avatar image).


Thank you @simonbyrne! Here are some code examples using the new version of MPI.jl.
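
For instance, a collective broadcast looks roughly like this (a sketch using the v0.10-era positional API; run with something like `mpiexec -n 4 julia bcast.jl`):

```julia
# Sketch: broadcast an array from rank 0 to all ranks.
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

buf = rank == 0 ? collect(1.0:5.0) : zeros(5)
MPI.Bcast!(buf, 0, comm)   # (buffer, root, comm); afterwards every rank holds 1.0:5.0
println("rank $rank has $buf")

MPI.Finalize()
```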


Basic Send / Recv! works fine on an ordinary Linux cluster with OpenMPI. Thanks!


Nice! Is there a way to partition a big CuArray into p parts, like a slab decomposition?

I’m working on it, based on barche/MPIArrays.jl. I think it can be released sometime this year.
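
In the meantime, a manual slab split along the last dimension can be sketched like this (illustrative only; this is not the MPIArrays.jl API, and the array size is made up):

```julia
# Illustrative slab decomposition of a CuArray along its last dimension
# (hypothetical helper code, not a library API).
using MPI, CuArrays

MPI.Init()
comm   = MPI.COMM_WORLD
rank   = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

A = CuArrays.rand(Float32, 128, 128, 64)   # example "big" array

n      = size(A, 3)
counts = [div(n, nranks) + (r < mod(n, nranks) ? 1 : 0) for r in 0:nranks-1]
lo     = sum(counts[1:rank]) + 1           # first slab index owned by this rank
hi     = lo + counts[rank+1] - 1
myslab = view(A, :, :, lo:hi)              # in a truly distributed setting each rank
                                           # would allocate only this slab

MPI.Finalize()
```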


You may be interested in the Python lib.


@rveltz This looks interesting. Thanks for the reference!

I have access to a DGX-1 GPU system, though I’m not sure how much time I can get on it. If there are any tests that could be run for this package, I could give it a try.

Do you know if it has a CUDA-aware MPI build? If so, it would be good to run the test suite with JULIA_PROJECT set to [pkgdir]/test/cudaenv.
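
One way to drive that from Julia might be the following (a rough sketch; `pkgdir` is a placeholder for a local checkout of MPI.jl, and the cudaenv project may need to be instantiated first):

```julia
# Sketch: run the MPI.jl test suite with JULIA_PROJECT pointing at the CUDA test environment.
pkgdir = "/path/to/MPI.jl"   # placeholder: local checkout of MPI.jl

withenv("JULIA_PROJECT" => joinpath(pkgdir, "test", "cudaenv")) do
    run(`$(Base.julia_cmd()) $(joinpath(pkgdir, "test", "runtests.jl"))`)
end
```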

@kose-y and @simonbyrne, a big thanks for making CUDA-aware MPI available to the Julia community! Unfortunately, when I tried CUDA-aware MPI with MPI.jl on two different systems, it failed in both cases. Could you have a look at this post where I reported the errors? It would be fantastic if I could get it to work before the AGU conference this weekend… :slight_smile: