Currently in the v1.3(.1) threading implementation, tasks are assigned to a random thread, and that thread might be the primary thread. This can be problematic in a couple of use cases, for example a GUI that suddenly slows down because a heavy-computation task just got assigned to the thread the GUI runs on (normally thread 1). I've found that any attempts at active thread management get impeded by the primary-thread assignments.
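For illustration, a quick way to see tasks landing on thread 1 (assuming Julia was started with several threads, e.g. JULIA_NUM_THREADS=8):

using Base.Threads
# spawn a batch of trivial tasks and record which thread each one ran on;
# thread 1 will typically show up among the results
tids = fetch.([@spawn threadid() for _ in 1:20])
println(sort(unique(tids)))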
I have not found a way to keep Threads.@spawn off of the primary thread, but I did find a way to do it for Threads.@threads (which uses a very different implementation than @spawn).
@threads divvies up the loop it is given by generating a function that handles a portion of the iterations based on the running thread's threadid. The C-level code it calls (jl_threading_run) takes that function and runs it once on each thread. This does not provide any real-time management, but it does keep each thread working on exactly one thing at a time.
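For example, the per-thread index arithmetic boils down to something like this simplified sketch (not the actual Base code, and ignoring the too-few-iterations edge case):

# simplified sketch of how stock @threads maps a thread id to a slice of indices
function chunk(tid, nthr, lenr)
    len, rem = divrem(lenr, nthr)   # base chunk size, plus leftover iterations
    f = 1 + (tid - 1) * len         # first index for this thread
    l = f + len - 1                 # last index for this thread
    if tid <= rem                   # leftovers go to the first `rem` threads
        f += tid - 1
        l += tid
    else
        f += rem
        l += rem
    end
    return f:l
end
# e.g. 10 iterations over 3 threads: [chunk(t, 3, 10) for t in 1:3] == [1:4, 5:7, 8:10]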
Using this mechanism, then, all one has to do is change the generated function so that it handles no iterations on the main thread and spreads them all across the remaining threads. The following is a modification of the Threads._threadsfor macro helper function:
function _bgthreadsfor(iter, lbody)
    lidx = iter.args[1]         # index
    range = iter.args[2]
    quote
        local threadsfor_fun
        let range = $(esc(range))
        function threadsfor_fun(onethread=false)
            r = range # Load into local variable
            lenr = length(r)
            # divide loop iterations among threads
            if onethread
                tid = 2 # Mod - act as the first worker thread so the index math below covers the whole range
                len, rem = lenr, 0
            else
                tid = Threads.threadid()
                tid == 1 && return # Mod - Keep execution off of the main thread
                len, rem = divrem(lenr, Threads.nthreads()-1) # Mod - Divide iterables over one less thread
            end
            # not enough iterations for all the threads?
            if len == 0
                if tid-1 > rem # Mod - compare against the worker index (tid-1)
                    return
                end
                len, rem = 1, 0
            end
            # compute this thread's iterations
            f = 1 + ((tid-2) * len) # Mod
            l = f + len-1 # Mod
            # distribute remaining iterations evenly
            if rem > 0
                if (tid-1) <= rem # Mod
                    f = f + tid-2 # Mod
                    l = l + tid-1 # Mod
                else
                    f = f + rem
                    l = l + rem
                end
            end
            # run this thread's iterations
            for i = f:l
                local $(esc(lidx)) = Base.unsafe_getindex(r, i)
                $(esc(lbody))
            end
        end
        end
        if Threads.threadid() != 1
            # only thread 1 can enter/exit _threadedregion
            Base.invokelatest(threadsfor_fun, true)
        else
            ccall(:jl_threading_run, Cvoid, (Any,), threadsfor_fun)
        end
        nothing
    end
end
The @threads macro itself then needs only a small change, renamed here to @bgthreads:
macro bgthreads(args...)
    na = length(args)
    if na != 1
        throw(ArgumentError("wrong number of arguments in @bgthreads"))
    end
    ex = args[1]
    if !isa(ex, Expr)
        throw(ArgumentError("need an expression argument to @bgthreads"))
    end
    if ex.head === :for
        return _bgthreadsfor(ex.args[1], ex.args[2]) # Mod
    else
        throw(ArgumentError("unrecognized argument to @bgthreads"))
    end
end
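With both definitions in place, usage looks just like Threads.@threads; a minimal sketch, where queries and do_query stand in for the real workload (and assuming Julia was started with at least two threads):

results = Vector{Any}(undef, length(queries))
@bgthreads for i in 1:length(queries)
    results[i] = do_query(queries[i])   # heavy work, never scheduled on thread 1
end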
To test this, I bolted some time-logging onto the tasks in my current project (a sketch of that kind of sampling follows the first table). There are 3,421 queries, and the total running time is just under 30 seconds. The following shows the number of tasks running on each thread at 1-second intervals, using the normal @threads:
Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
Thread 2: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 3: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 4: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 6: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 7: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 8: 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
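For reference, the counts above came from ad-hoc logging bolted onto the tasks; a rough sketch of that kind of once-per-second sampling (hypothetical names, not the project's actual code):

using Base.Threads
# one atomic counter per thread: how many logged tasks are currently running there
const active = [Atomic{Int}(0) for _ in 1:nthreads()]

function with_logging(f)
    tid = threadid()
    atomic_add!(active[tid], 1)       # task started on this thread
    try
        return f()
    finally
        atomic_sub!(active[tid], 1)   # task finished
    end
end

# sampler task: print one line of per-thread counts every second
@async while true
    println(join((c[] for c in active), " "))
    sleep(1)
end

Each query body would then be wrapped as with_logging(() -> do_query(q)) inside the loop.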
That same set of queries run under @bgthreads:
Thread 1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 2: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
Thread 3: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 4: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 6: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 7: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 8: 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Note that Thread 1 is untouched.
Also note, though, that the running time is not reduced. In this use case, 10% of the queries take 90% of the time, and clearly the Thread 2 batch (which used to be the Thread 1 batch) contains some of the long ones. So here, @spawn would be better, but it forces a return to Thread 1:
Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
Thread 2: 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
Thread 3: 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Thread 4: 3 1 1 1 2 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Thread 5: 3 6 6 6 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
Thread 6: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0
Thread 7: 2 1 1 1 1 1 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 8: 1 5 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 0 0 0 0 0 0 0
and its thread assignments are random from run to run:
Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
Thread 2: 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
Thread 3: 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
Thread 4: 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 2 1 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
Thread 6: 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 2 1 0 0 0 0 0 0 0 0 0
Thread 7: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 5 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 8: 4 4 4 4 4 4 4 4 4 2 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
The ideal case would be a Threads.@spawnat, to allow active management of task assignments. Hopefully in the future.