Multithreaded task spreading

Apologies for the simple question. If I keep the number of active Threads.@spawn tasks <= the number of threads, will I get a more-or-less one-task-per-thread distribution?

I am working on a TaskPool that manages incoming tasks, placing them across multiple threads as previous tasks finish. (In my use case, 1% of the Tasks take 99% of the time, so the default distribution of the Threads.@threads macro doesn’t really cover me, and I’ll eventually need to extend to stack processing anyway.) I’d like to limit the tasks to one per thread (perhaps this is debatable), but there is no Threads.spawnat macro, and the Task scheduler refers only to a cryptic (to me) “multiq”. Specifically, it says

# if multiq is full, give to a random thread (TODO fix).

This is the only thread distribution info I can find - it descends into C after that. Hence the question above.
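To make the question concrete, this is roughly the experiment I have in mind (a minimal sketch; busywork is just a stand-in for the real computation):

        using Base.Threads

        # stand-in for a real computation, just to keep each task busy for a while
        function busywork(n)
            s = 0.0
            for i in 1:n
                s += sin(i)
            end
            return s
        end

        # spawn exactly nthreads() tasks and record which thread each one is on when it finishes
        tasks = Task[]
        for _ in 1:nthreads()
            push!(tasks, Threads.@spawn begin
                busywork(10_000_000)
                threadid()
            end)
        end
        ids = [fetch(t) for t in tasks]
        @show sort(ids)   # ideally a permutation of 1:nthreads(), i.e. one task per thread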

I’m happy to help in this area if needed, BTW.

The macro @spawnat does indeed exist. See here.
It is not from Threads but from Distributed.

I was under the impression that the Distributed version was explicitly not the right one to use for same-CPU threading? Distributed.@spawn definitely is not, but is @spawnat different?
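For concreteness, this is my current understanding of the split, sketched under the assumption that a couple of worker processes have been added with addprocs:

        using Distributed
        addprocs(2)                        # two extra worker processes

        # Distributed.@spawnat pins work to a *process*, identified by its id:
        fut = @spawnat 2 myid()
        @show fetch(fut)                   # 2, i.e. it ran on worker process 2

        # Threads.@spawn schedules a task onto some thread of the current process:
        t = Threads.@spawn Threads.threadid()
        @show fetch(t)                     # whichever thread the scheduler picked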

This is how to use the @spawn macro:

        tasks = Task[]
        for th in 1:length(threadbuffs)
            push!(tasks, Threads.@spawn begin
                # each task works on its own per-thread buffer, so there are no data races
                fill!(threadbuffs[th].assembler.F_buffer, 0.0)
                # now add the restoring force from the subset of the mesh handled by this task
                restoringforce(threadbuffs[th].femm, threadbuffs[th].assembler, geom, un1, un, tn, dtn, true)
            end)
        end
        for th in 1:length(tasks)
            wait(tasks[th])    # Base.wait blocks until the task has finished
            Fn .+= threadbuffs[th].assembler.F_buffer
        end

This scheme relies on the scheduler to do the distribution, which is fine, especially if all of the Tasks are equally loaded. The schemes I have running now look similar. But for one of my use cases specifically, there are two main differences:

  • I need to be able to add incoming Tasks as outgoing ones finish (because each Task can generate new data to be processed)

  • A few of the Tasks are 1000x heavier than the other 99%, but there is no way to know ahead of time which ones they are. So I can’t simply wrap the above in an outer loop: if one batch catches a heavy job, all of the other threads sit idle in the inner wait loop until the big one finishes. (A sketch of the kind of pool I have in mind follows this list.)
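The kind of pool I mean, as a minimal sketch (process and the Int item type are placeholders for the real work, and an empty initial queue would need extra handling): workers pull items from a Channel, any worker can push newly generated items back onto it, and a single heavy item only ties up one worker while the others keep draining the queue.

        using Base.Threads

        # placeholder for the real work; returns any newly generated items
        process(item::Int) = item < 50 ? [2item, 2item + 1] : Int[]

        function run_pool(initial_items)
            queue   = Channel{Int}(Inf)    # unbounded queue of pending work
            pending = Atomic{Int}(0)       # items queued or in flight, not yet finished

            for item in initial_items
                atomic_add!(pending, 1)
                put!(queue, item)
            end

            workers = Task[]
            for _ in 1:nthreads()
                push!(workers, Threads.@spawn begin
                    for item in queue                    # blocks until an item arrives or the channel closes
                        for newitem in process(item)     # a task may generate follow-on work
                            atomic_add!(pending, 1)
                            put!(queue, newitem)
                        end
                        # that was the last outstanding item: close the queue so all workers exit
                        atomic_sub!(pending, 1) == 1 && close(queue)
                    end
                end)
            end
            foreach(wait, workers)
        end

        run_pool(1:10)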

In any case, I have the TaskPool prototype running now, so I’ll just do some testing on it to see how the distributions look. Thanks for the help.

Maybe this is related?

https://github.com/JuliaLang/julia/pull/34185

It is indeed - thanks, I’ll take a look.