First, I want to say thank you for maintaining the MPI.jl package.
I am using MPI inside a for-loop (for iteration=1:5000). At each iteration, all ranks send their data to rank 0 using MPI.Gatherv!, then rank 0 sends some data back to all ranks using MPI.Scatterv!.
My code fails after some iterations, sometimes due to an out-of-memory error on rank 0, sometimes on other ranks. I ran the code multiple times, and it failed at a different iteration each time.
I am confused because the data sizes in all calculations are the same in every iteration, so why do I run out of memory?
Is there some garbage collection issue with MPI? Should I call MPI.Barrier(comm) at the end of each iteration to wait until all ranks have finished garbage collection? Could you please give me some suggestions on garbage collection with MPI?
Below is an example of my code, including all the MPI functions I use:
    # Receive buffer for the gather: only rank 0 needs a real one
    if my_rank == 0
        Z_all_vbuf = VBuffer(Z_all, counts)
    else
        Z_all_vbuf = VBuffer(nothing)
    end

    for iteration in 1:5000
        my_Z = f1(my_Z, my_res)
        # All ranks send my_Z to rank 0
        MPI.Gatherv!(my_Z, Z_all_vbuf, 0, comm)
        if my_rank == 0
            res_all = f2(Z_all)
            res_all_vbuf = VBuffer(res_all, size_all)
        else
            res_all_vbuf = VBuffer(nothing)
        end
        # Rank 0 sends each rank its part of res_all
        my_res = MPI.Scatterv!(res_all_vbuf, my_size, 0, comm)
    end
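
To make the garbage collection question concrete, this is roughly what I mean by collecting at the end of an iteration. It is only a sketch: `gc_every` is a made-up name and value, the loop body is elided, and I do not know whether this is the right approach. The only calls involved are `GC.gc()` from Julia and `MPI.Barrier(comm)`.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD

# Hypothetical interval: how often to collect is a guess, not a recommendation.
gc_every = 100

for iteration in 1:5000
    # ... the Gatherv! / Scatterv! work for this iteration would go here ...

    if iteration % gc_every == 0
        GC.gc()            # force a full collection on this rank
        MPI.Barrier(comm)  # wait until every rank has reached this point
    end
end

MPI.Finalize()
```

Would something like this help, or is the barrier unnecessary here?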
Thank you so much,