Is this parallel performance correct?

I saw some strange performance today and I’m wondering what I’m doing wrong here.

julia> using BenchmarkTools

julia> a = [1:1000000;];

julia> @btime map(log, a);
  13.449 ms (3 allocations: 7.63 MiB)

julia> addprocs(4)
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> wp=CachingPool(workers())
CachingPool(Channel{Int64}(sz_max:9223372036854775807,sz_curr:4), Set([4, 2, 3, 5]), Dict{Tuple{Int64,Function},RemoteChannel}())

julia> @btime pmap(wp, log, a);
  133.079 s (119748792 allocations: 3.38 GiB)

I did notice that the bottleneck seemed to be the master process (it sat at 90+% CPU the whole time while the workers hovered between 30% and 40%), but I was still surprised by how much slower the pmap version was. Any ideas? My guess is that because each log call is so cheap, the time is dominated by data movement between processes, but it would be nice to have someone confirm that.
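
One way to sanity-check the data-movement guess (a sketch of my own, not from the session above; on recent Julia you also need `using Distributed` for `addprocs`/`pmap`) is to ship one big chunk per worker instead of one element per task. If the chunked version is close to the serial map, the per-element communication really is the cost:

using Distributed
addprocs(4)

a = [1:1_000_000;]

# One chunk per worker: communication happens a handful of times
# instead of once per element.
n = cld(length(a), nworkers())
chunks = [a[i:min(i + n - 1, length(a))] for i in 1:n:length(a)]

# Each worker runs a plain serial map over its chunk; only the chunk
# and its result travel between processes.
result = reduce(vcat, pmap(c -> map(log, c), chunks))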

You might want to use the batch_size keyword argument for pmap. It doesn’t completely remove the communication overhead, but it reduces it quite a bit. For example:

julia> using BenchmarkTools

julia> a = [1:1000000;];

julia> @btime map(log, a);
  17.034 ms (3 allocations: 7.63 MiB)

julia> addprocs(4)
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> @btime pmap(log, a);
  67.329 s (93247393 allocations: 2.65 GiB)

julia> @btime pmap(log, a, batch_size=10000);
  3.189 s (7012233 allocations: 185.70 MiB)
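
If the per-element work really is as cheap as log, another option (just a sketch of mine, not something from the timings above, and it only applies when all workers run on the same machine) is to keep both input and output in shared memory, so no array data has to be serialized between processes at all:

using Distributed
addprocs(4)
using SharedArrays  # loaded after addprocs so the package is also loaded on the workers

# Input and output live in shared memory; the loop ships only loop bounds
# and lightweight SharedArray handles, not the data itself.
a = SharedVector{Int}(1_000_000)
a .= 1:1_000_000
out = SharedVector{Float64}(length(a))

@sync @distributed for i in eachindex(a)
    out[i] = log(a[i])
end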
