Multithreaded task spreading

Apologies for the simple question. If I keep the number of active Threads.@spawn tasks <= the number of threads, will I get a more-or-less one-task-per-thread distribution?

I am working on a TaskPool that manages incoming tasks, placing them across multiple threads as previous tasks finish. (In my use case, 1% of the Tasks take 99% of the time, so the default distribution of the Threads.@threads macro doesn’t really cover me, and I’ll eventually need to extend to stack processing anyway.) I’d like to limit the tasks to one per thread (perhaps this is debatable), but there is no Threads.spawnat macro, and the Task scheduler refers only to a cryptic (to me) “multiq”. Specifically, it says

# if multiq is full, give to a random thread (TODO fix).

This is the only thread distribution info I can find - it descends into C after that. Hence the question above.
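To make the question concrete, this is roughly the experiment I have in mind (a minimal sketch; busywork is just a stand-in for the real computation):

        using Base.Threads

        # stand-in for a real computation, just to keep each task busy for a while
        function busywork(n)
            s = 0.0
            for i in 1:n
                s += sin(i)
            end
            return s
        end

        # spawn exactly nthreads() tasks and record which thread each one is on when it finishes
        tasks = Task[]
        for _ in 1:nthreads()
            push!(tasks, Threads.@spawn begin
                busywork(10_000_000)
                threadid()
            end)
        end
        ids = [fetch(t) for t in tasks]
        @show sort(ids)   # ideally a permutation of 1:nthreads(), i.e. one task per thread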

I’m happy to help in this area if needed, BTW.

The macro @spawnat does indeed exist. See here.
It is not from Threads but from Distributed.

I was under the impression that the Distributed version was explicitly not the right one to use for same-CPU threading? Distributed.@spawn definitely is not, but is @spawnat different?
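For concreteness, this is my current understanding of the split, sketched under the assumption that a couple of worker processes have been added with addprocs:

        using Distributed
        addprocs(2)                        # two extra worker processes

        # Distributed.@spawnat pins work to a *process*, identified by its id:
        fut = @spawnat 2 myid()
        @show fetch(fut)                   # 2, i.e. it ran on worker process 2

        # Threads.@spawn schedules a task onto some thread of the current process:
        t = Threads.@spawn Threads.threadid()
        @show fetch(t)                     # whichever thread the scheduler picked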

This is how to use the @spawn macro:

        tasks = Task[]
        for th in 1:length(threadbuffs)
            push!(tasks, Threads.@spawn begin
                # each task works on its own per-thread buffer, so there are no data races
                fill!(threadbuffs[th].assembler.F_buffer, 0.0)
                # now add the restoring force from the subset of the mesh handled by this task
                restoringforce(threadbuffs[th].femm, threadbuffs[th].assembler, geom, un1, un, tn, dtn, true)
            end)
        end
        for th in 1:length(tasks)
            wait(tasks[th])    # Base.wait blocks until the task has finished
            Fn .+= threadbuffs[th].assembler.F_buffer
        end

This scheme relies on the scheduler to do the distribution, which is fine, especially if all of the Tasks are equally loaded. The schemes I have running now look similar. But for one of my use cases specifically, there are two main differences:

  • I need to be able to add incoming Tasks as outgoing ones finish (because each Task can generate new data to be processed)

  • A few of the Tasks are 1000x heavier than the other 99%, but there is no way to know ahead of time which ones they are. So I can’t simply wrap the above in an outer loop: if one batch catches a heavy job, all of the other threads sit idle in the inner wait loop until the big one finishes. (A sketch of the kind of pool I have in mind follows this list.)
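The kind of pool I mean, as a minimal sketch (process and the Int item type are placeholders for the real work, and an empty initial queue would need extra handling): workers pull items from a Channel, any worker can push newly generated items back onto it, and a single heavy item only ties up one worker while the others keep draining the queue.

        using Base.Threads

        # placeholder for the real work; returns any newly generated items
        process(item::Int) = item < 50 ? [2item, 2item + 1] : Int[]

        function run_pool(initial_items)
            queue   = Channel{Int}(Inf)    # unbounded queue of pending work
            pending = Atomic{Int}(0)       # items queued or in flight, not yet finished

            for item in initial_items
                atomic_add!(pending, 1)
                put!(queue, item)
            end

            workers = Task[]
            for _ in 1:nthreads()
                push!(workers, Threads.@spawn begin
                    for item in queue                    # blocks until an item arrives or the channel closes
                        for newitem in process(item)     # a task may generate follow-on work
                            atomic_add!(pending, 1)
                            put!(queue, newitem)
                        end
                        # that was the last outstanding item: close the queue so all workers exit
                        atomic_sub!(pending, 1) == 1 && close(queue)
                    end
                end)
            end
            foreach(wait, workers)
        end

        run_pool(1:10)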

In any case, I have the TaskPool prototype running now, so I’ll just do some testing on it to see how the distributions look. Thanks for the help.

Maybe this is related?

https://github.com/JuliaLang/julia/pull/34185

It is indeed - thanks, I’ll take a look.