CUDA aware MPI works on system but not for Julia

Good, so the first issue (downloading the CUDA artifacts) is solved.

For the second issue - the error message - I assume that you know how to set the required variables for CUDA-aware MPI in general as you say it works with C++…

So, maybe try this to see whether the problem is related to the functionality you are using in your all-to-all example:

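using MPI
using CUDA

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)  # named nranks to avoid shadowing Base.size

# Ring exchange: send to the next rank, receive from the previous one.
dst = mod(rank+1, nranks)
src = mod(rank-1, nranks)
println("rank=$rank, size=$nranks, dst=$dst, src=$src")

# Device buffers are passed directly to MPI (this requires CUDA-aware MPI).
N = 4
send_mesg = CuArray{Float64}(undef, N)
recv_mesg = CuArray{Float64}(undef, N)
fill!(send_mesg, Float64(rank))
CUDA.synchronize()  # make sure fill! has finished before MPI reads the buffer

#rreq = MPI.Irecv!(recv_mesg, src,  src+32, comm)
MPI.Sendrecv!(send_mesg, dst, 0, recv_mesg, src, 0, comm)
println("recv_mesg on proc $rank: $recv_mesg")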

If the problem remains, set LD_PRELOAD to point to your libcuda.so and libcudart.so. On Piz Daint, for example, this was done as follows:
LD_PRELOAD=/usr/lib64/libcuda.so:/usr/local/cuda/lib64/libcudart.so
This was a workaround required there before Cray fixed an issue in Cray-MPICH…
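
Before launching with LD_PRELOAD set, it may help to confirm that the listed libraries actually exist on the compute node, since the paths differ between systems. The following is only a minimal sketch: `check_preload` is a hypothetical helper name (not part of MPI.jl or CUDA.jl), and it only checks that each entry is a regular file, not that the library is the right one.

```julia
# Hypothetical helper: list each LD_PRELOAD entry and whether the file exists.
# Entries may be separated by colons or whitespace.
function check_preload(preload::AbstractString = get(ENV, "LD_PRELOAD", ""))
    entries = split(preload, r"[:\s]+"; keepempty=false)
    return [(String(p), isfile(p)) for p in entries]
end

# Report the status of whatever LD_PRELOAD is set to in this environment.
for (path, found) in check_preload()
    println(path, found ? "  (found)" : "  (missing)")
end
```

Running this on each rank, e.g. at the top of the test script above, would show whether every node sees the same libraries.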
