Multithreading @threads inside @spawn

Paul_McVay · May 4, 2021, 1:27am

Are @threads loops inside a @spawn supposed to find idle threads to use? For example, my function uses @threads to perform three 10-second sleeps in 10 seconds total. If I then spawn 3 of these function calls, ideally they would run in parallel and take 10 seconds. However, it takes 30 seconds. Why do @spawn and @threads not find the idle threads to run on? I’ve launched julia with 10 threads on a 20 thread machine.

Working example demonstrating the problem:

function test(str)
    Threads.@threads for i in 1:3
        sleep(10)
    end
    return "$str done"
end 

t1 = Threads.@spawn test("one")
t2 = Threads.@spawn test("two")
t3 = Threads.@spawn test("three")
fetch(t1)
fetch(t2)
fetch(t3)

If 9 threads were used, one per sleep, it would take 10 seconds total instead of 30 seconds.

julia> Threads.nthreads()
10

[pmcvay@pmcvay ~]$ nproc
20

Paul_McVay · May 4, 2021, 1:31am

Ahh. @threads inside @spawn uses only one thread:

t1 = Threads.@spawn test("one")
fetch(t1)

This takes 30 seconds.

Any advice on how to accomplish what I am trying to do?

tkf · May 4, 2021, 2:57am

Threads.@threads does not use dynamic scheduling, so it does not perform well when iterating over an array of length around nthreads() or less, especially when multiple functions that use @threads are called in parallel.

julia> Threads.nthreads()
10

julia> function threads_sched(nsamples = 1000)
           sched = [[[0], [0], [0]] for _ in 1:nsamples]
           for refs in sched
               Threads.@threads for i in 1:3
                   refs[i][1] = Threads.threadid()
               end
           end
           sched
       end
threads_sched (generic function with 2 methods)

julia> unique(threads_sched())
1-element Vector{Vector{Vector{Int64}}}:
 [[1], [2], [3]]

Furthermore, as you observed, there is no parallelism if you start @threads in a non-primary thread:

julia> Threads.@threads for _ in 1:Threads.nthreads()
           if Threads.threadid() == 2
               global sched = threads_sched()
           end
       end

julia> unique(sched)
1-element Vector{Vector{Vector{Int64}}}:
 [[2], [2], [2]]

We don’t have this problem in @spawn:

julia> function spawn_sched(nsamples = 1000)
           sched = [[[0], [0], [0]] for _ in 1:nsamples]
           for refs in sched
               @sync for i in 1:3
                   Threads.@spawn refs[i][1] = Threads.threadid()
               end
           end
           sched
       end
spawn_sched (generic function with 2 methods)

julia> unique(spawn_sched())
388-element Vector{Vector{Vector{Int64}}}:
 [[7], [1], [2]]
 [[2], [2], [1]]
 [[2], [7], [7]]
 [[4], [7], [2]]
 ...

If you want dynamic scheduling, I recommend avoiding @threads. If you want a higher-level API than @spawn (which is hard to use), there are packages such as FLoops.jl and Folds.jl that do not have this problem of @threads.
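
For the original test function, a minimal sketch of the @spawn approach could look like the following. It replaces the @threads loop with one task per iteration inside @sync, the same pattern as spawn_sched above. The name test_spawn and the delay keyword are made up for this example, and the delay is shortened to 1 second in the usage lines so it runs quickly; I have not timed it on the 10-thread setup from the question.

```julia
# Sketch only: `test_spawn` and `delay` are hypothetical names, not from the thread.
# Each iteration becomes its own task, so the scheduler is free to place it
# on any available thread instead of pinning the loop to the current one.
function test_spawn(str; delay = 10)
    @sync for i in 1:3
        Threads.@spawn sleep(delay)   # one task per sleep
    end                               # @sync waits for all three tasks
    return "$str done"
end

# Same usage pattern as in the question, with a shortened delay.
t1 = Threads.@spawn test_spawn("one"; delay = 1)
t2 = Threads.@spawn test_spawn("two"; delay = 1)
t3 = Threads.@spawn test_spawn("three"; delay = 1)
results = fetch.((t1, t2, t3))
```

Since every sleep is its own task, the nine sleeps should be able to overlap rather than run one after another. For real CPU-bound work the speedup would still depend on how many threads julia was started with.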
