I have just tagged a new version of MPI.jl. Though the user-facing interface is largely the same, there has been extensive work underneath to use the C API internally (instead of the Fortran one). As a result, the build process is much simpler (it no longer requires CMake or a Fortran compiler). It also directly supports CUDA-aware MPI libraries, allowing CuArrays to be passed directly as buffers (thanks to Seyoon Ko).
I would greatly appreciate it if people could try it out, especially on different clusters and with different MPI implementations.
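For anyone who wants to try it out, a minimal smoke test could look something like the sketch below. It is only an illustration under assumptions: `allreduce_check` is a hypothetical helper name (not part of MPI.jl), and the device-buffer call is left commented out because it only makes sense on a machine with a GPU, CuArrays installed, and an MPI library built with CUDA support.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

# Hypothetical helper (not part of MPI.jl): sum a buffer of ones across
# all ranks and check that every entry equals the number of ranks.
function allreduce_check(buf, comm)
    total = MPI.Allreduce(buf, +, comm)
    # Array(...) copies the result to the host, so the same check works
    # whether `buf` lives in host or device memory.
    return all(Array(total) .== MPI.Comm_size(comm))
end

# Host buffer: should work with any MPI implementation.
ok = allreduce_check(ones(Float32, 4), comm)
println("rank $rank of $nranks: host buffer ok = $ok")

# Device buffer: ASSUMES a CUDA-aware MPI build and a working CuArrays
# installation, so it is left commented out here.
# using CuArrays
# ok_gpu = allreduce_check(CuArray(ones(Float32, 4)), comm)
# println("rank $rank of $nranks: device buffer ok = $ok_gpu")
```

Saved to a file (the name is up to you), this would typically be launched with something like `mpiexec -n 2 julia <file>`. If the host check passes but the device check crashes or fails, that would suggest the MPI library is not actually CUDA-aware.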
I have access to a DGX-1 GPU system. I’m not sure how much time I can get on it.
If there are any tests that could be run for this package, I could give them a try.
@kose-y and @simonbyrne, a big thanks for making CUDA-aware MPI available to the Julia community! Unfortunately, when I tried CUDA-aware MPI with MPI.jl on two different systems, it failed in both cases. Could you have a look at this post where I reported the errors? It would be fantastic if I could get it working before the AGU conference this weekend…
As mentioned in the issue, it would be great to add GPU support to PencilFFTs. The MPI-heavy part of the code was recently refactored, so hopefully extending it to work with CuArrays will not take too much work.
I personally don’t have any experience with CUDA-aware MPI, and I don’t have access to multi-GPU systems (that I’m aware of), so any help with this is most welcome!