MPI RMA with CUDA

I’m interested in using MPI RMA (i.e. one-sided operations) with native CuArrays.

From the MPI.jl documentation, we read:

[CuArrays] may also work with one-sided operations, but these are not often supported

Could someone elaborate on why they are “often” not supported? Is there a particular MPI implementation or CUDA version that is required for this to work reliably from Julia?

Thanks!

I think the short answer is that this sentence in the docs was written a couple of years ago. If you have a recent version of UCX + OpenMPI, one-sided operations “should work”.
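Before trying RMA on CuArrays, it may be worth checking that the MPI library MPI.jl is using actually reports CUDA support. Here is a minimal sketch, assuming a reasonably recent MPI.jl. As far as I know, `MPI.has_cuda()` can only query Open MPI directly and may return `false` for other implementations even when they are CUDA-aware, so treat the result as a hint rather than a guarantee.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# Ask MPI.jl whether the underlying MPI library is known to be CUDA-aware.
# Assumption: a `false` here does not prove that GPU buffers will fail,
# only that support could not be detected.
cuda_aware = MPI.has_cuda()

if rank == 0
    # Library version string as reported by the MPI implementation itself.
    println("MPI library: ", MPI.Get_library_version())
    println("Reports CUDA support: ", cuda_aware)
end
```

Run it under the same launcher you plan to use for the real job (for example `mpiexec -n 2 julia script.jl`), since a different MPI library may be picked up otherwise.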

Good to hear, thanks!