What is the real @time for an MPI program?

Carol · April 23, 2019, 11:02pm

Hi, I have a question on timing an MPI code, because I want to test its performance.

For example, if I split my code into 10 parallel jobs using MPI, then @time returns 10 timing records. I also tried start = time(); ... ; elapsed = time() - start, but it still returns 10 records, just like @time.

Is the real time for these 10 jobs the maximum, or the sum?

e.g.

4.166182 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
3.853497 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
4.091061 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.169676 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.114676 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
0.705804 seconds (436.84 k allocations: 21.515 MiB, 0.50% gc time)
4.054897 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
3.813533 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)
4.148482 seconds (436.84 k allocations: 21.515 MiB, 0.08% gc time)
4.012410 seconds (436.84 k allocations: 21.515 MiB, 0.09% gc time)

Thanks!

JaredCrean2 · April 24, 2019, 12:36am

I’m not clear on precisely how you are generating the output shown (a minimal example, or even some pseudo-code, would help), but if you have code like this:

# file tmp.jl
using MPI

@time begin
  MPI.Init()
  # do computation here
  MPI.Finalize()
end

And then run mpirun -np 10 julia ./tmp.jl, then each of the 10 MPI processes will print the results of @time for that process (@time will not do any summing or averaging or anything like that).

The question of what is the “real” time does not have a well defined answer because the processes do not necessarily finish at the same time. You could say the “real” time is the time it takes the longest process to finish. You could also say the “real” time is the length of time from when the first process is created to when the last one finishes (keeping in mind that these might be different processes).
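A minimal sketch of the "longest process" definition, assuming MPI.jl and a placeholder workload: each rank times its own region with MPI.Wtime, and an Allreduce then gives every rank the slowest and fastest per-rank time. The `work` function, the problem size, and the file name in the comment are hypothetical and not from the thread.

```julia
# Sketch: time a region on every rank, then combine the per-rank times.
# Run with e.g.: mpirun -np 10 julia timing_sketch.jl  (hypothetical file name)
using MPI

# Placeholder workload (hypothetical); stands in for "do computation here".
work(n) = sum(sqrt(i) for i in 1:n)

function main()
    MPI.Init()
    comm = MPI.COMM_WORLD
    rank = MPI.Comm_rank(comm)

    MPI.Barrier(comm)            # line the ranks up before starting the clock
    t0 = MPI.Wtime()
    work(10_000_000)
    elapsed = MPI.Wtime() - t0   # wall-clock seconds on this rank only

    # Every rank receives the slowest and the fastest per-rank time.
    tmax = MPI.Allreduce(elapsed, MPI.MAX, comm)
    tmin = MPI.Allreduce(elapsed, MPI.MIN, comm)

    if rank == 0
        println("slowest rank: ", tmax, " s, fastest rank: ", tmin, " s")
    end
    MPI.Finalize()
end

main()
```

Note that the clock here starts after MPI.Init, so process start-up is excluded, unlike the @time block above, which wraps MPI.Init and MPI.Finalize.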

Carol · April 24, 2019, 3:52am

Hi, I wrote an MPI dot product function. For example:

function mpi_dot(a,b,n)
    MPI.Init()
    .......
    allsum = MPI.Reduce(local_sum, MPI.SUM, 0, comm)
    MPI.Finalize()
end

I want to test its speed by running @time mpi_dot(a,b,n), so that I can compare it with a non-parallel dot product.

But now I have multiple time records, and I don’t know which one to use for the comparison.
Can I just use the maximum time record?

JaredCrean2 · April 24, 2019, 12:46pm

Because the Reduce function is returning the result only on process 0, I would use the time from process 0.

Carol · May 9, 2019, 11:06pm

Hi, I printed the rank id with each timing so that I can tell which result belongs to process 0.

What surprises me is that the time of process 0 is not the maximum among all the results.
Could you please tell me why? Should I still use the time for process 0?

Thanks

JaredCrean2 · May 10, 2019, 1:07am

I’m not sure why. It can depend on the order in which mpirun launches processes, which is not defined by the MPI standard.

I think the time on process 0 is a reasonable choice. The max time would also be a reasonable choice. I’m not sure one is significantly better than the other.
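One way to make the rank 0 number less sensitive to launch order is to keep MPI.Init and MPI.Finalize outside the timed region and to synchronize with a barrier before starting the clock. The sketch below assumes MPI.jl; `local_range`, `timed_dot`, the vector contents, and the problem size are hypothetical names and values chosen for illustration, not the code from the thread. It also runs the function once before timing, since the first call of a Julia function includes compilation time.

```julia
# Sketch: timing a distributed dot product, reported on rank 0.
using MPI
using LinearAlgebra

# 1-based index range owned by `rank` (0-based) when n items are split
# as evenly as possible across nprocs ranks.
function local_range(n, rank, nprocs)
    base, extra = divrem(n, nprocs)
    lo = rank * base + min(rank, extra) + 1
    hi = lo + base - 1 + (rank < extra ? 1 : 0)
    return lo:hi
end

# Partial dot product on each rank, summed onto rank 0.
# MPI.Reduce returns the sum on rank 0 and `nothing` on the other ranks.
mpi_dot(a_local, b_local, comm) =
    MPI.Reduce(dot(a_local, b_local), MPI.SUM, 0, comm)

# Synchronize, then time only the dot product and the reduction.
function timed_dot(a_local, b_local, comm)
    MPI.Barrier(comm)
    t0 = MPI.Wtime()
    s = mpi_dot(a_local, b_local, comm)
    return s, MPI.Wtime() - t0
end

function main()
    MPI.Init()
    comm = MPI.COMM_WORLD
    rank = MPI.Comm_rank(comm)
    nprocs = MPI.Comm_size(comm)

    n = 1_000_000                      # hypothetical global vector length
    r = local_range(n, rank, nprocs)
    a_local = ones(length(r))          # this rank's slice of a
    b_local = fill(2.0, length(r))     # this rank's slice of b

    timed_dot(a_local, b_local, comm)  # warm-up call, absorbs compilation
    s, elapsed = timed_dot(a_local, b_local, comm)

    if rank == 0
        println("dot = ", s, ", elapsed on rank 0 = ", elapsed, " s")
    end
    MPI.Finalize()
end

main()
```

Because rank 0 only returns from the Reduce once it has the combined result, its elapsed time covers the whole operation from the barrier onward, which is the quantity to compare against a serial dot product.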
