Are threads inside a spawn supposed to find idle threads to use? For example, my function uses @threads to perform three 10-second sleeps in 10 seconds total. If I then spawn 3 of these function calls, ideally they would run in parallel and still take 10 seconds. However, it takes 30 seconds. Why don't @spawn and @threads find the idle threads to run on? I've launched Julia with 10 threads on a 20-thread machine.
Working example demonstrating the problem:
function test(str)
    Threads.@threads for i in 1:3
        sleep(10)
    end
    return "$str done"
end
t1 = Threads.@spawn test("one")
t2 = Threads.@spawn test("two")
t3 = Threads.@spawn test("three")
fetch(t1)
fetch(t2)
fetch(t3)
If each of the 9 sleeps ran on its own thread, the whole thing would take 10 seconds instead of 30.
Threads.@threads does not use dynamic scheduling, so it does not perform well when iterating over an array of length around nthreads() or less, especially when several functions that use @threads are called in parallel.
julia> Threads.nthreads()
10
julia> function threads_sched(nsamples = 1000)
           sched = [[[0], [0], [0]] for _ in 1:nsamples]
           for refs in sched
               Threads.@threads for i in 1:3
                   refs[i][1] = Threads.threadid()
               end
           end
           sched
       end
threads_sched (generic function with 2 methods)
julia> unique(threads_sched())
1-element Vector{Vector{Vector{Int64}}}:
 [[1], [2], [3]]
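To make the result above concrete, here is a rough sketch of what a fixed, up-front split of iterations across threads looks like. `static_assignment` is a hypothetical helper I made up for illustration, not part of Julia, and it only assumes that the iteration range is cut into contiguous, near-equal chunks before the loop starts. That assumption is consistent with the `[[1], [2], [3]]` output, but it is not a copy of the actual implementation.

```julia
# Hypothetical helper (not a Julia API): return, for each of `n` iterations,
# the id of the thread that would own it if the range were split into
# contiguous, near-equal chunks across `nt` threads before the loop runs.
function static_assignment(n::Integer, nt::Integer)
    len, rem = divrem(n, nt)
    owner = Vector{Int}(undef, n)
    start = 1
    for tid in 1:nt
        # The first `rem` threads take one extra iteration.
        chunk = len + (tid <= rem ? 1 : 0)
        for i in start:(start + chunk - 1)
            owner[i] = tid
        end
        start += chunk
    end
    return owner
end

static_assignment(3, 10)   # [1, 2, 3]
static_assignment(10, 4)   # [1, 1, 1, 2, 2, 2, 3, 3, 4, 4]
```

With 3 iterations and 10 threads, only threads 1 to 3 ever get work under such a split, no matter how busy or idle the other threads are.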
Furthermore, as you observed, there is no parallelism if you start @threads in a non-primary thread:
julia> Threads.@threads for _ in 1:Threads.nthreads()
           if Threads.threadid() == 2
               global sched = threads_sched()
           end
       end
julia> unique(sched)
1-element Vector{Vector{Vector{Int64}}}:
 [[2], [2], [2]]
We don’t have this problem with @spawn:
julia> function spawn_sched(nsamples = 1000)
           sched = [[[0], [0], [0]] for _ in 1:nsamples]
           for refs in sched
               @sync for i in 1:3
                   Threads.@spawn refs[i][1] = Threads.threadid()
               end
           end
           sched
       end
spawn_sched (generic function with 2 methods)
julia> unique(spawn_sched())
388-element Vector{Vector{Vector{Int64}}}:
 [[7], [1], [2]]
 [[2], [2], [1]]
 [[2], [7], [7]]
 [[4], [7], [2]]
 ...
If you want dynamic scheduling, I recommend avoiding @threads. If you want a higher-level API than @spawn (which is hard to use), there are packages such as FLoops.jl and Folds.jl that do not have the problem of @threads.
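For the example in the question, a minimal sketch of the @spawn route could look like the following. The name `test_spawn` and the keyword arguments `nsleeps` and `seconds` are my own additions for illustration. I shortened the sleep to 1 second so it is quick to try; with the default of 10 seconds the three calls should finish in roughly 10 seconds rather than 30, since every sleep is its own task. Also, as far as I know, Julia 1.8 and later default @threads to a `:dynamic` schedule, so the original code may behave differently there.

```julia
# Variant of `test` from the question: each sleep becomes its own task
# instead of an iteration of a `Threads.@threads` loop.
function test_spawn(str; nsleeps = 3, seconds = 10)
    # `@sync` waits for every task spawned inside the loop.
    @sync for _ in 1:nsleeps
        Threads.@spawn sleep(seconds)
    end
    return "$str done"
end

t0 = time()
# One outer task per call, as in the original example.
tasks = [Threads.@spawn test_spawn(s; seconds = 1) for s in ("one", "two", "three")]
results = fetch.(tasks)   # ["one done", "two done", "three done"]
elapsed = time() - t0     # wall-clock seconds for all nine sleeps
```

The nested structure is the same as before (three outer calls, three sleeps each), but nothing is pinned to a particular thread, so the scheduler is free to place the tasks wherever there is room.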