Question about CUDA-aware MPI

I’d like to make sure I understand how Julia hooks into CUDA-aware MPI. Is it true that when sending/receiving Julia objects that may have a CuArray somewhere inside them, the CuArray is never moved to the CPU but is instead passed directly from GPU to GPU?

Here’s an example script:

using MPI
MPI.Init()

using CuArrays
using CUDAdrv
using CUDAnative

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
device!(rank)   # assign one GPU per MPI rank

@info "MPI process $rank is using $(device())"

if rank == 0
    # send a NamedTuple containing a CuArray
    dat = (x = cu(ones(4, 4)),)
    MPI.send(dat, 1, 0, comm)
else
    # MPI.recv returns (object, status)
    dat, = MPI.recv(0, 0, comm)
    @show dat
end

When I mpiexec -n 2 this script on a machine with 2 GPUs, does the CuArray inside dat ever have to go through the CPU? Is there a way to check? Thanks.

There is no easy way to check :), but the fact that your code runs without crashing is a good indication.
I am not sure we handle the case where a CuArray is wrapped in a tuple right now, though.
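
If it helps, a more direct way to exercise the CUDA-aware path is to hand the CuArray itself to the buffer-based MPI.Send / MPI.Recv!, rather than serializing a wrapping tuple with send/recv. A minimal sketch, assuming your MPI.jl version accepts device buffers and the underlying MPI library is CUDA-aware:

# Sketch only: pass the CuArray buffer directly to MPI, assuming the
# installed MPI library is CUDA-aware and MPI.jl accepts device buffers here.
using MPI, CuArrays
MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

if rank == 0
    x = cu(ones(Float32, 4, 4))
    MPI.Send(x, 1, 0, comm)           # device pointer handed straight to MPI
else
    x = CuArrays.zeros(Float32, 4, 4)
    MPI.Recv!(x, 0, 0, comm)          # received directly into GPU memory
    @show Array(x)
end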

The current version of MPI.jl has an MPI.has_cuda() function with which you can check whether CUDA-aware MPI is available.
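
For example, something along these lines:

using MPI
MPI.Init()

# Reports whether the underlying MPI library advertises CUDA support;
# if this is false, device buffers typically get staged through host memory.
@info "CUDA-aware MPI: $(MPI.has_cuda())"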

A good indication is to run the NVIDIA system profiler with its MPI integration and check whether you are seeing unnecessary copies to the host.
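
For instance (an illustrative command line, assuming OpenMPI and nvprof; exact flags depend on your setup, and script.jl stands in for the test script above):

mpiexec -n 2 nvprof -o profile.rank%q{OMPI_COMM_WORLD_RANK}.nvprof julia script.jl

You can then open the per-rank output files in the visual profiler and look for [CUDA memcpy HtoD] / [CUDA memcpy DtoH] entries around the MPI calls.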