Hi,
I'm experimenting a bit with parallel execution, using this example here, a straightforward parallel-and-collect-results pattern.
The issue is that it's not really improving wallclock time (6.56 s, 6.18 s and 5.97 s total with 2, 4 and 8 threads), although it does spend less time in the measured part (0.67 s, 0.55 s and 0.45 s). Here are the measurements:
jmerelo@camerelo ~/Code/julia/BraveNewAlgorithm.jl main julia --threads 2 examples/multi_only_crossover.jl
Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 2
[ Info: Reading parameters file
0.671818 seconds (15.27 M allocations: 718.511 MiB, 60.20% gc time, 16.01% compilation time)
[ Info: All offspring -> 1000000
jmerelo@camerelo ~/Code/julia/BraveNewAlgorithm.jl main time julia --threads 2 examples/multi_only_crossover.jl
Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 2
[ Info: Reading parameters file
0.672166 seconds (15.27 M allocations: 718.561 MiB, 60.24% gc time, 15.95% compilation time)
[ Info: All offspring -> 1000000
julia --threads 2 examples/multi_only_crossover.jl 7,36s user 0,35s system 117% cpu 6,559 total
jmerelo@camerelo ~/Code/julia/BraveNewAlgorithm.jl main time julia --threads 4 examples/multi_only_crossover.jl
Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 4
[ Info: Reading parameters file
0.545097 seconds (15.29 M allocations: 718.505 MiB, 63.35% gc time, 51.88% compilation time)
[ Info: All offspring -> 1000000
julia --threads 4 examples/multi_only_crossover.jl 7,69s user 0,35s system 130% cpu 6,178 total
jmerelo@camerelo ~/Code/julia/BraveNewAlgorithm.jl main time julia --threads 8 examples/multi_only_crossover.jl
Activating project at `~/Code/julia/BraveNewAlgorithm.jl`
WARNING: using Distances.pairwise in module BraveNewAlgorithm conflicts with an existing identifier.
[ Info: Number of threads -> 8
[ Info: Reading parameters file
0.448228 seconds (15.29 M allocations: 717.147 MiB, 61.91% gc time, 73.29% compilation time)
[ Info: All offspring -> 1000000
julia --threads 8 examples/multi_only_crossover.jl 8,51s user 0,38s system 148% cpu 5,974 total
So it looks like the small improvement it's getting is being eaten up by thread setup, which seems to need circa 0.2 seconds per thread. Is that a ballpark estimate that checks out, or am I getting something wrong here?
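For reference, here is a minimal sketch of the parallel-and-collect pattern I mean, with a warm-up call so that the timed call is not dominated by compilation. It is not the actual code in `examples/multi_only_crossover.jl`: `make_offspring` and `parallel_collect` are hypothetical names, and the doubling inside `make_offspring` is only a placeholder for the crossover work. It assumes `n >= 1`.

```julia
using Base.Threads

# Hypothetical stand-in for the crossover step: builds the "offspring" of one chunk.
function make_offspring(chunk::UnitRange{Int})
    return [2 * i for i in chunk]
end

# Split 1:n into contiguous chunks (at most `ntasks` of them), run each chunk
# in its own task, then collect the partial results in order.
function parallel_collect(n::Int; ntasks::Int = nthreads())
    chunk_len = cld(n, ntasks)
    chunks = [i:min(i + chunk_len - 1, n) for i in 1:chunk_len:n]
    tasks = map(chunks) do c
        Threads.@spawn make_offspring(c)
    end
    return reduce(vcat, fetch.(tasks))
end

@info "Number of threads -> $(nthreads())"

# Warm-up call on a small input: this is where the compilation cost is paid.
parallel_collect(1_000)

# Timed call: same argument types as the warm-up, so it should mostly measure run time.
offspring = @time parallel_collect(1_000_000)
@info "All offspring -> $(length(offspring))"
```

Note that `@time` inside the script only covers the call it wraps, while the shell's `time` also counts Julia startup, project activation and package loading, so the two numbers are not expected to move together.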