Maybe I don’t understand how @spawn works, but here is what I am observing: when I run a parallel task with it, performance is significantly worse if more threads are available (i.e. starting Julia with -tX with a larger X), even if I am not spawning more tasks.
Example code is below, but essentially what I see is:
- Start Julia with julia -t32 and run the code spawning 32 tasks: 216 μs
- Start Julia with julia -t64 and run the code spawning 32 tasks: 3.323 ms
How should I understand this? The consequence is that, in parallel code, even when I know it is not worth spawning more tasks because of limited scalability, performance gets worse merely because more threads are available.
(FWIW, this is on a machine with 128 physical cores.)
Test code:
using Base.Threads: @spawn
using BenchmarkTools # provides @btime, used in test() below
using LinearAlgebra: norm
using StaticArrays
using Test

# Parallel code for sum(norm.( x - y for x in x, y in y ))
function sumd(x,y,nbatches;aux=zeros(nbatches))
    @assert length(x)%nbatches == 0
    batchsize = length(x) ÷ nbatches
    aux .= 0.
    # One task per batch; each task accumulates into its own slot of aux
    @sync for ibatch in 1:nbatches
        ifirst = (ibatch-1)*batchsize + 1
        ilast = ibatch*batchsize
        @spawn begin
            for i in ifirst:ilast
                for j in 1:length(y)
                    @inbounds aux[ibatch] += norm(y[j]-x[i])
                end
            end
        end
    end
    return sum(aux)
end

function test(nbatches)
    x = [ rand(SVector{3,Float64}) for _ in 1:6400 ]
    y = [ rand(SVector{3,Float64}) for _ in 1:10 ]
    # Check the parallel result against the serial expression
    @test sumd(x,y,nbatches) ≈ sum(norm.( x - y for x in x, y in y ))
    aux = zeros(nbatches)
    @btime sumd($x,$y,$nbatches;aux=$aux)
end
Here is the step-by-step:
Start Julia with 32 threads and spawn 32 tasks:
[lmartine@adano58 old]$ JULIA_EXCLUSIVE=1 julia -t32
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("./test.jl")
test (generic function with 1 method)

julia> test(32)
  216.054 μs (205 allocations: 17.83 KiB)
42508.465422704496
Start Julia with 64 threads and again spawn 32 tasks:
[lmartine@adano58 old]$ JULIA_EXCLUSIVE=1 julia -t64
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.0 (2021-11-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> include("./test.jl")
test (generic function with 1 method)

julia> test(32)
  3.323 ms (205 allocations: 17.83 KiB)
40862.74149814549
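In case it helps with diagnosing this, a check along the following lines could confirm that the two sessions differ only in the number of available threads, not in the number of spawned tasks. This is only a minimal sketch: spawned_thread_ids is a hypothetical helper that is not part of test.jl, and I am assuming that recording threadid() inside each task is a good enough indication of where the task ran (the scheduler may place, and in newer Julia versions migrate, tasks differently from run to run).

```julia
using Base.Threads: @spawn, nthreads, threadid

# Hypothetical helper (not part of test.jl): spawn `ntasks` tasks and
# record the id of the thread on which each task executed its body.
function spawned_thread_ids(ntasks)
    ids = zeros(Int, ntasks)
    @sync for i in 1:ntasks
        @spawn ids[i] = threadid()
    end
    return ids
end

ids = spawned_thread_ids(32)
println("threads available:     ", nthreads())
println("tasks spawned:         ", length(ids))
println("distinct threads used: ", length(unique(ids)))
```

Running this in both the -t32 and the -t64 session should report the same number of spawned tasks, while the number of available threads, and possibly the set of threads actually used, changes.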