Consider the following scenario (in local scope):
```julia
@threads for i ∈ 1:128
    y[i] = dosomethingexpensive(x[i])
end
```
Say I'm running this on a machine with 32 physical cores. Then often, the last n < 32 calls to `dosomethingexpensive` are completed by fewer than n threads.
What would be the best way to achieve greater balancing?
Background: an example would be the case in which each i corresponds to a replication in a simulation study, where each replication can take a few minutes, but where there is no ex ante expectation that one replication would take longer than another.
Currently, only direct segmentation is supported by the `@threads` macro. That is, there is no work-stealing API directly available yet - you'll have to do the balancing yourself. One way "around" that is to use a `Channel` of tasks, which are created ahead of time and pushed into that channel. After all tasks are created, `take!` from the channel on all threads and execute the given task, thereby emulating a work-stealing scheduler.
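A minimal sketch of that pattern (the `x`, `y`, and `dosomethingexpensive` here are hypothetical stand-ins for the ones in the question):

```julia
using Base.Threads

# Hypothetical placeholder for the expensive, variable-duration work.
dosomethingexpensive(v) = (sleep(0.01); v^2)

x = collect(1:128)
y = zeros(Int, length(x))

# Push all work items into a Channel ahead of time, then close it so that
# iteration on each worker terminates once the channel is drained.
jobs = Channel{Int}(length(x))
foreach(i -> put!(jobs, i), eachindex(x))
close(jobs)

# One long-lived task per thread; iterating a Channel take!s items until
# the channel is closed and empty, so fast threads simply grab more work.
@sync for _ in 1:nthreads()
    @spawn for i in jobs
        y[i] = dosomethingexpensive(x[i])
    end
end
```

Because each worker pulls the next index only when it finishes the previous one, an unlucky thread stuck on a slow replication doesn't hold up a pre-assigned chunk of iterations.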
You can also take a look at ThreadPools.jl, though that comes with some caveats, since Julia doesn't pin its threads to particular CPU threads, etc.
`Threads.@spawn` does load balancing.
I've played with something like `@sync for ...`, but I recall reading on this forum that `@threads` is preferable for load-balancing reasons… I'll play around some more.
`@threads` has lower overhead ("is cheaper") but doesn't do load balancing at all: the iteration range of the loop is split into equal parts according to the number of available threads. OTOH, `@spawn` implements a form of load balancing but has more overhead. See the blog post Announcing composable multi-threaded parallelism in Julia.
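For the loop in the question, the `@spawn`-based variant would look roughly like this (again with hypothetical stand-ins for `x`, `y`, and `dosomethingexpensive`):

```julia
using Base.Threads

dosomethingexpensive(v) = (sleep(0.001); v + 1)  # hypothetical placeholder

x = collect(1:128)
y = zeros(Int, length(x))

# One task per iteration: Julia's scheduler runs tasks on whichever thread
# becomes free, giving dynamic load balancing at the cost of per-task overhead.
@sync for i in eachindex(x)
    @spawn y[i] = dosomethingexpensive(x[i])
end
```

With only 128 iterations of a few minutes each, the per-task overhead is negligible relative to the work, so this trade-off clearly favors `@spawn`.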
`@spawn` is basically the same as managing the tasks explicitly by hand via a `Channel`. In the case of `@spawn`, it's the Julia task system that's doing the "balancing" for you implicitly.
As of Julia 1.5, `@threads` accepts a `schedule` argument, though currently only `:static` ("which creates one task per thread and divides the iterations equally among them") is supported. In the future, when more kinds of scheduling are supported, `@threads` may be the better option (though I'm not sure what the current direction of things in that regard is).
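For reference, the `:static` schedule is spelled like this; it reproduces the pre-1.5 behavior of one contiguous chunk per thread:

```julia
using Base.Threads

# :static splits 1:8 into nthreads() contiguous chunks, one task per thread.
ids = zeros(Int, 8)
@threads :static for i in 1:8
    ids[i] = threadid()  # with :static, threadid() is stable within an iteration
end
```

Inspecting `ids` after the loop shows which thread handled which chunk of the range.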
I've implemented load-balancing threaded parallel loops in FLoops.jl, which also supports a wide class of scheduling policies depending on your needs (plus other things, like distributed and GPU-based parallel loops and reductions).
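A sketch of how that might look with FLoops.jl, assuming the package is installed (the `dosomethingexpensive` placeholder is hypothetical; `ThreadedEx` and its `basesize` option are the executor API FLoops.jl exports):

```julia
using FLoops  # assumes FLoops.jl has been added to the environment

dosomethingexpensive(v) = v^2  # hypothetical placeholder

function run_sim(x)
    y = zeros(eltype(x), length(x))
    # basesize = 1 makes each iteration its own chunk, i.e. maximal load
    # balancing; a larger basesize trades balancing for lower scheduling overhead.
    @floop ThreadedEx(basesize = 1) for i in eachindex(x)
        y[i] = dosomethingexpensive(x[i])
    end
    return y
end
```

The `basesize` knob is the scheduling policy choice: for replications that each take minutes, `basesize = 1` is the natural setting.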