The graph above shows the execution time per call of the function `once` in the program below, as a function of the number of workers, on a 32-core/64-thread machine. The green line (left y axis) corresponds to the program as shown; the red line (right y axis) corresponds to the program with the commented line swapped with the line above it. A flat line is good: it means extra workers add no overhead.
I’m not surprised that the program as presented is faster than the alternative, but I am surprised by how much faster it is, and by the fact that the red curve shows next to no gains beyond 16 processes.
My questions are:
Is this phenomenon specific to Julia, is it specific to the machine architecture, both, or neither?
What other surprises does parallel computing in Julia have in store for me that are not documented in the official Julia documentation?
What would be a good, exhaustive source on parallel computing with Julia?
```julia
using Distributed, BenchmarkTools

procs = parse(Int64, ARGS[1])   # number of workers, from the command line
addprocs(procs; topology=:master_worker)
R = parse(Int64, ARGS[2])       # number of tasks handed to pmap

@everywhere function once(x::Int64)
    z = fill(0.0, 10)
    for i = 1:1_000_000
        for j = 1:10
            z[j] = rand()
        end
        #~ z[:] = rand(10)   # swapped in for the inner loop to produce the red curve
    end
end

@btime pmap(x -> once(x), [j for j = 1:R])
```
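For what it's worth, the two inner-loop variants can be compared on a single process, without `Distributed` at all. The sketch below is my assumption about the mechanism, not an established diagnosis: `z[:] = rand(10)` allocates a fresh 10-element array on every pass, while writing `rand()` into each slot allocates nothing after `z` itself, so the red-curve variant generates garbage-collector pressure that may not scale across workers.

```julia
# Minimal single-process sketch (assumption: the red-curve slowdown is
# driven by per-iteration heap allocation, not by rand() itself).

# Green-curve variant: fills z in place, slot by slot.
function fill_scalar!(z::Vector{Float64})
    for j in eachindex(z)
        z[j] = rand()          # scalar write, no allocation
    end
end

# Red-curve variant: rand(10) builds a new array on every call.
function fill_vector!(z::Vector{Float64})
    z[:] = rand(10)            # allocates, then copies into z
end

z = zeros(10)
fill_scalar!(z); fill_vector!(z)   # warm up so compilation is not measured

a_scalar = @allocated fill_scalar!(z)
a_vector = @allocated fill_vector!(z)
println("scalar fill: $a_scalar bytes; vector fill: $a_vector bytes")
```

Multiplied by the 1,000,000 iterations inside `once`, that per-call allocation difference adds up quickly.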