In-place MPI_Allreduce gives wrong results. Solved: better docs are needed

jsjie · January 3, 2025, 2:16am

An MWE:

using MPI

a = [1.0, 2.0, 3.0, 4.0]
MPI.Init()
MPI.API.MPI_Allreduce(MPI.API.MPI_IN_PLACE, a, 4, MPI.Datatype(Float64), MPI.Op(+, Float64), MPI.COMM_WORLD)
MPI.Finalize()
println(a)

which outputs

[NaN, 1.386670151907536e-309, 1.38666778029815e-309, 1.38667015182163e-309]

A non-inplace one runs correctly:

using MPI

a = [1.0, 2.0, 3.0, 4.0]
b = zeros(4)
MPI.Init()
MPI.API.MPI_Allreduce(a, b, 4, MPI.Datatype(Float64), MPI.Op(+, Float64), MPI.COMM_WORLD)
MPI.Finalize()
println(a)
println(b)

which outputs

[1.0, 2.0, 3.0, 4.0]
[2.0, 4.0, 6.0, 8.0]

The first two arguments follow the Open MPI manual: MPI_IN_PLACE for sendbuf, and the actual send-and-receive buffer for recvbuf.

I’ve found the solution. Changing MPI.API.MPI_IN_PLACE to MPI.IN_PLACE solves the problem. However, since I’m directly calling a low-level function through MPI.API, it surprises me that I need to use a constant from MPI instead of MPI.API.
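For reference, here is a minimal sketch of the working in-place variant. It also shows the high-level MPI.Allreduce! wrapper, which I believe reduces in place when given a single buffer. The names comm, nranks, and b are only for illustration.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
nranks = MPI.Comm_size(comm)  # saved so the result can be checked after Finalize

# Low-level call: MPI.IN_PLACE as sendbuf, `a` as the send-and-receive buffer.
a = [1.0, 2.0, 3.0, 4.0]
MPI.API.MPI_Allreduce(MPI.IN_PLACE, a, length(a), MPI.Datatype(Float64), MPI.Op(+, Float64), comm)

# High-level wrapper: with a single buffer argument the reduction happens in place.
b = [1.0, 2.0, 3.0, 4.0]
MPI.Allreduce!(b, +, comm)

MPI.Finalize()
println(a)
println(b)
```

Every rank starts with the same vector, so each entry should end up multiplied by the number of ranks, matching the non-in-place output above when run on two ranks.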

carstenbauer · January 3, 2025, 6:38am

Not at a computer, so can’t test. But isn’t MPI.API.MPI_IN_PLACE a ref? Try using MPI.API.MPI_IN_PLACE[].

jsjie · January 3, 2025, 6:59am

That also works. I examined the source code and found the following:

const IN_PLACE = InPlace()

and

struct InPlace
end
Base.cconvert(::Type{MPIPtr}, ::InPlace) = API.MPI_IN_PLACE[]

So MPI.IN_PLACE and MPI.API.MPI_IN_PLACE[] are effectively the same.
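To see why the bare Ref probably misbehaves, here is a small sketch that does not need MPI. FAKE_IN_PLACE, FakeInPlace, and as_pointer are hypothetical stand-ins for MPI.API.MPI_IN_PLACE, InPlace, and the conversion that ccall performs; I use Ptr{Cvoid} in place of MPIPtr, and the sentinel value 1 is made up. My assumption is that passing the Ref itself hands MPI the address of the Ref's storage rather than the sentinel stored inside it.

```julia
# Hypothetical stand-in for MPI.API.MPI_IN_PLACE: a Ref holding a sentinel pointer.
const FAKE_IN_PLACE = Ref{Ptr{Cvoid}}(Ptr{Cvoid}(1))

# Hypothetical stand-in for InPlace: converts to the sentinel stored in the Ref.
struct FakeInPlace end
Base.cconvert(::Type{Ptr{Cvoid}}, ::FakeInPlace) = FAKE_IN_PLACE[]

# Mimics the two conversion steps ccall applies to a pointer argument.
as_pointer(x) = Base.unsafe_convert(Ptr{Cvoid}, Base.cconvert(Ptr{Cvoid}, x))

sentinel  = FAKE_IN_PLACE[]               # what the C library expects to receive
via_type  = as_pointer(FakeInPlace())     # the sentinel itself
via_deref = as_pointer(FAKE_IN_PLACE[])   # the sentinel itself
via_ref   = as_pointer(FAKE_IN_PLACE)     # address of the Ref's storage, not the sentinel

println(via_type == sentinel)
println(via_deref == sentinel)
println(via_ref == sentinel)
```

If the real constant behaves like this, MPI would treat the Ref's own memory as an ordinary send buffer, which would explain the garbage values in the first output.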
