ANN: MPI.jl v0.10.0: new build process and CUDA-aware support

I’m working on it based on barche/MPIArrays.jl. I think it can be released some time this year.

1 Like

You may be interested in the Python library.

1 Like

@rveltz This looks interesting. Thanks for the reference!

I have access to a DGX-1 GPU system, though I’m not sure how much time I can get on it.
If there are any tests that could be run for this package, I could give it a try.

Do you know if it has a CUDA-aware MPI? If so, it would be good to run the test suite with JULIA_PROJECT set to [pkgdir]/test/cudaenv.
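
Roughly what I have in mind (an untested sketch; the path is a placeholder, and `addenv` needs Julia ≥ 1.6):

```julia
# Point JULIA_PROJECT at the CUDA test environment and run the MPI.jl test suite
# in a fresh Julia process. `mpi_pkgdir` stands in for a local MPI.jl checkout.
mpi_pkgdir = "/path/to/MPI.jl"
cudaenv = joinpath(mpi_pkgdir, "test", "cudaenv")
cmd = addenv(`julia -e 'using Pkg; Pkg.test("MPI")'`, "JULIA_PROJECT" => cudaenv)
run(cmd)
```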

@kose-y and @simonbyrne, a big thanks for making CUDA-aware MPI available to the Julia community! Unfortunately, when I tried CUDA-aware MPI with MPI.jl on two different systems, it failed in both cases. Could you have a look at this post where I reported the errors? It would be fantastic if I could get it to work before the AGU conference this weekend… :slight_smile:

Hi,

Is there any news about Distributed Arrays with MPI, or mixing CuArrays with MPI?

You will need to build and link against a CUDA-aware MPI implementation, but other than that, CuArrays should work with MPI.jl:
https://juliaparallel.github.io/MPI.jl/stable/usage/#CUDA-aware-MPI-support-1
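
For example, something along these lines should work (a minimal sketch, assuming MPI.jl was built against a CUDA-aware MPI; run it under mpiexec):

```julia
# Device buffers are passed directly to MPI calls, with no staging through host memory.
using MPI, CuArrays

MPI.Init()
comm = MPI.COMM_WORLD

send = CuArray(fill(Float32(MPI.Comm_rank(comm)), 4))  # buffer lives on the GPU
recv = similar(send)
MPI.Allreduce!(send, recv, MPI.SUM, comm)               # sum over all ranks, on the GPU

MPI.Finalize()
```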

There is a proof-of-concept package of distributed arrays:


but other than that, no. It really depends on what sort of functionality you want; for example, we’ve built our own to provide support for ghost elements (https://github.com/CliMA/ClimateMachine.jl/blob/master/src/Arrays/MPIStateArrays.jl), but it makes a lot of assumptions about data layout, etc.
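
To make the “ghost elements” idea concrete, here is a generic 1-D halo-exchange sketch with plain MPI.jl point-to-point calls (just an illustration, not the MPIStateArrays API):

```julia
# Each rank owns `n` interior cells plus one ghost cell on each side, and swaps
# boundary values with its periodic neighbours.
using MPI

MPI.Init()
comm  = MPI.COMM_WORLD
rank  = MPI.Comm_rank(comm)
nproc = MPI.Comm_size(comm)

n = 8
u = zeros(Float64, n + 2)      # u[1] and u[end] are ghost cells
u[2:end-1] .= rank             # fill the interior with this rank's id

left  = mod(rank - 1, nproc)
right = mod(rank + 1, nproc)

recv_left  = zeros(1)
recv_right = zeros(1)
reqs = MPI.Request[]
push!(reqs, MPI.Isend(u[2:2],         left,  0, comm))   # my left boundary -> left neighbour
push!(reqs, MPI.Isend(u[end-1:end-1], right, 1, comm))   # my right boundary -> right neighbour
push!(reqs, MPI.Irecv!(recv_left,  left,  1, comm))      # left neighbour's right boundary
push!(reqs, MPI.Irecv!(recv_right, right, 0, comm))      # right neighbour's left boundary
MPI.Waitall!(reqs)

u[1], u[end] = recv_left[1], recv_right[1]

MPI.Finalize()
```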

I see. Ideally, I want to perform FFTs on multiple GPUs…

There is some discussion here: https://github.com/jipolanco/PencilFFTs.jl/issues/3

Yes! That is exactly what I am looking for.

As mentioned in the issue, it would be great to add GPU support to PencilFFTs. The MPI-heavy part of the code was recently refactored, and hopefully extending things to work with CuArrays should not take too much work.
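
For reference, current (CPU) usage looks roughly like the following, and a CuArray-backed version would ideally follow the same pattern (a sketch with placeholder sizes; run with 4 MPI ranks):

```julia
using MPI
using PencilFFTs
using Random

MPI.Init()
comm = MPI.COMM_WORLD

dims_global = (64, 64, 64)    # global grid size
proc_dims   = (2, 2)          # 2-D "pencil" decomposition over 4 MPI ranks

transform = Transforms.RFFT() # real-to-complex transform of the 3-D field
plan = PencilFFTPlan(dims_global, transform, proc_dims, comm)

u = allocate_input(plan)      # distributed array holding this rank's block
rand!(u)
û = plan * u                  # forward transform; global transposes happen internally

MPI.Finalize()
```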

I personally don’t have any experience with CUDA-aware MPI, and I don’t have access to multi-GPU systems (that I’m aware of), so any help with this is most welcome!

I have just released it here.

3 Likes

This is really cool stuff. I am wondering whether you should rename it MPIArraysV2.jl :sweat_smile:

It would be a shame to have this work “duplicated”. I see there is already PencilArrays as well.

Wow, that’s awesome; that might be exactly what I need. Agreed, it seems like this implements a superset of MPIArrays, and since that package looks abandoned, could this one take over?

1 Like

@kose-y I have a few questions / feature requests:

  • Are nested MPIArrays supported/planned? I’m thinking of an array, distributed along a communicator, each element of which is again an MPIArray along another communicator.
  • Are arrays distributed along two dimensions supported/planned?
  • Is indexing supported, or is it intentionally disabled?
1 Like

Wow. Super cool to see all these updates on the distributed array and FFT support. Thanks to everyone involved! I can only agree: it would be super cool to bring the various efforts under the same hood.

Wow. That leads me to ask about topology awareness of the cluster network.
The cluster scheduler will have topology knowledge, i.e. which compute nodes are close to each other on the network. In a simple fat tree, those are groups of nodes on the same switch.

So could we have those communicators launch the other communicators on nodes on the same switch?

P.S. I am not limiting the discussion to one topology.

That is theoretically the point of the MPI topology functions, some of which are currently available in MPI.jl (as always, PRs are welcome if you would like to add more functionality).
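
For instance, grouping the ranks that share a node is already straightforward with a communicator split; switch-level grouping would additionally need topology information from the scheduler or the Cartesian/graph topology routines. A sketch:

```julia
# Build one sub-communicator per shared-memory node, so ranks on the same machine
# end up grouped together.
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

node_comm = MPI.Comm_split_type(comm, MPI.COMM_TYPE_SHARED, rank)
println("global rank $rank is local rank $(MPI.Comm_rank(node_comm)) on its node")

MPI.Finalize()
```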

1 Like