What is the real @time for an MPI program?

parallel
#1

Hi, I have a question on timing an MPI code, because I want to test its performance.

For example, if I split my code into 10 parallel jobs using MPI, then @time returns 10 timing records. BTW, I also tried start = time(); ... ; elapsed = time() - start, but it still returns 10 records, just like @time.

Is the real time for these 10 jobs the maximum, or the sum?

E.g.:

4.166182 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
3.853497 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
4.091061 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.169676 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.114676 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
0.705804 seconds (436.84 k allocations: 21.515 MiB, 0.50% gc time)
4.054897 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
3.813533 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
4.148482 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.012410 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)

Thanks!

#2

I’m not clear on precisely how you are generating the output shown (a minimal example, or even some pseudo-code, would help), but if you have code like this:

# file tmp.jl
using MPI

@time begin
  MPI.Init()
  # do computation here
  MPI.Finalize()
end

And then run mpirun -np 10 julia ./tmp.jl, then each of the 10 MPI processes will print the results of @time for that process (@time will not do any summing or averaging or anything like that).

The question of what the “real” time is does not have a well-defined answer, because the processes do not necessarily finish at the same time. You could say the “real” time is the time it takes the longest process to finish. You could also say the “real” time is the length of time from when the first process is created to when the last one finishes (keeping in mind that these might be different processes).
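To make the maximum-versus-sum distinction concrete, here is a small sketch using the ten timings from the first post. It assumes all ten processes started at roughly the same moment, which the output alone does not confirm. Under that assumption the slowest record approximates the wall-clock time of the run, while the sum is the total time spent across all processes, not how long you waited.

```julia
# Timings (in seconds) copied from the @time output in the first post.
times = [4.166182, 3.853497, 4.091061, 4.169676, 4.114676,
         0.705804, 4.054897, 3.813533, 4.148482, 4.012410]

# Assumption: all processes started at about the same time, so the
# slowest one determines how long the whole run took.
slowest = maximum(times)

# The sum adds up time spent by every process; it is an aggregate cost,
# not the elapsed wall-clock time.
total = sum(times)

println("slowest process: ", slowest, " s")
println("sum over processes: ", total, " s")
```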

#3

Hi, I wrote an MPI dot product function. For example:

function mpi_dot(a,b,n)
    MPI.Init()
    .......
    allsum = MPI.Reduce(local_sum, MPI.SUM, 0, comm)
    MPI.Finalize()
end

I want to test its speed by running @time mpi_dot(a,b,n), so that I can compare it with a non-parallel dot product.

But now I have multiple time records, and I don’t know which one to use for the comparison.
Can I just use the maximum time record?

#4

Because the Reduce function is returning the result only on process 0, I would use the time from process 0.

#5

Hi, I printed the rank id alongside each timing so that I can tell which result belongs to process 0.

What surprises me is that the time for process 0 is not the maximum among all the results.
Could you please tell me why? Should I still use the time for process 0?

Thanks

#6

I’m not sure why. It can depend on the order in which mpirun launches processes, which is not defined by the MPI standard.

I think the time on process 0 is a reasonable choice. The max time would also be a reasonable choice. I’m not sure one is significantly better than the other.
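If you want both numbers from a single run, one possible approach is sketched below. It is not taken from the code in this thread: do_work is a hypothetical stand-in for the real computation, and using MPI.Barrier, MPI.Wtime and MPI.Allreduce with MPI.MAX is just one way to do it. The barrier lines the ranks up before timing starts, so differences in launch order matter less.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)

# Hypothetical stand-in for the real per-process computation.
do_work() = sum(sqrt(i) for i in 1:10^6)

MPI.Barrier(comm)              # wait until every rank is ready to start
t0 = MPI.Wtime()
result = do_work()
elapsed = MPI.Wtime() - t0     # wall-clock seconds on this rank

# Largest elapsed time over all ranks; every rank receives the same value.
max_elapsed = MPI.Allreduce(elapsed, MPI.MAX, comm)

# Print once, from rank 0, instead of once per process.
if rank == 0
    println("rank 0 time: ", elapsed, " s")
    println("max time:    ", max_elapsed, " s")
end

MPI.Finalize()
```

Run it the same way as before, e.g. mpirun -np 10 julia ./tmp.jl. Note that the first call to a function in Julia also includes compilation time, so calling do_work() once before the timed section may give more representative numbers.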
