One trick to get to background-only threads

In the current v1.3(.1) threading implementation, tasks are assigned to a random thread, and that thread might be the primary thread. This is a problem in some use cases, for example a GUI that suddenly slows down because a heavy-computation task was just assigned to the same thread. I’ve found that any attempt at active thread management is impeded by these primary-thread assignments.

I have not found a way to keep Threads.@spawn off of the primary thread, but I did find a way to do it for Threads.@threads (which uses a very different implementation from @spawn).

@threads divvies up the loop it is given by creating a function that handles a portion of the iterations based on the threadid it is running on. The C-level code it calls (jl_threading_run) takes that function and runs it once on each thread. This does not provide any real-time management, but it does keep each thread working on exactly one thing at a time.
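
As a rough illustration of that static split, here is a minimal sketch of the index arithmetic as a plain function. The name static_chunk and its arguments are made up for this example; it mirrors the divrem-based logic of the generated function but is not the Base code itself.

```julia
# Hypothetical helper (not part of Base): the index range that slot `slot`
# (1-based) handles when `lenr` iterations are split over `nslots` slots.
function static_chunk(lenr::Int, nslots::Int, slot::Int)
    len, rem = divrem(lenr, nslots)
    if len == 0
        # Fewer iterations than slots: the first `rem` slots get one each.
        slot > rem && return 1:0
        len, rem = 1, 0
    end
    f = 1 + (slot - 1) * len
    l = f + len - 1
    if rem > 0
        if slot <= rem
            # The first `rem` slots each absorb one extra iteration.
            f += slot - 1
            l += slot
        else
            f += rem
            l += rem
        end
    end
    return f:l
end

# 10 iterations over 4 slots: two slots of 3, then two slots of 2.
chunks = [static_chunk(10, 4, s) for s in 1:4]
println(chunks)
```

With the stock @threads, the slot is simply the threadid and the slot count is nthreads(), so whatever lands in slot 1 always runs on the primary thread.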

Using this mechanism, then, all one has to do is change the generated function to handle no iterations when on the main thread, and to split all of them among the other threads. The following is a modification of the Threads._threadsfor macro function:

function _bgthreadsfor(iter,lbody)
    lidx = iter.args[1]         # index
    range = iter.args[2]
    quote
        local threadsfor_fun
        let range = $(esc(range))
        function threadsfor_fun(onethread=false)
            r = range # Load into local variable
            lenr = length(r)
            # divide loop iterations among threads
            if onethread
                tid = 2                                   # Mod - first background slot, so f:l below covers 1:lenr
                len, rem = lenr, 0
            else
                tid = Threads.threadid()
                tid == 1 && return                        # Mod - Keep execution off of the main thread
                len, rem = divrem(lenr, Threads.nthreads()-1)  # Mod - Divide iterables over one less thread
            end
            # not enough iterations for all the threads?
            if len == 0
                if tid-1 > rem                            # Mod - slots are numbered tid-1, so thread rem+1 still gets work
                    return
                end
                len, rem = 1, 0
            end
            # compute this thread's iterations
            f = 1 + ((tid-2) * len)                       # Mod
            l = f + len-1                                 # Mod
            # distribute remaining iterations evenly
            if rem > 0
                if (tid-1) <= rem                         # Mod
                    f = f + tid-2                         # Mod
                    l = l + tid-1                         # Mod
                else
                    f = f + rem
                    l = l + rem
                end
            end
            # run this thread's iterations
            for i = f:l
                local $(esc(lidx)) = Base.unsafe_getindex(r,i)
                $(esc(lbody))
            end
        end
        end
        if Threads.threadid() != 1
            # only thread 1 can enter/exit _threadedregion
            Base.invokelatest(threadsfor_fun, true)
        else
            ccall(:jl_threading_run, Cvoid, (Any,), threadsfor_fun)
        end
        nothing
    end
end

Then modifying the @threads macro:

macro bgthreads(args...)
    na = length(args)
    if na != 1
        throw(ArgumentError("wrong number of arguments in @bgthreads"))
    end
    ex = args[1]
    if !isa(ex, Expr)
        throw(ArgumentError("need an expression argument to @bgthreads"))
    end
    if ex.head === :for
        # ex.args[1] is the `i = range` header, ex.args[2] is the loop body
        return _bgthreadsfor(ex.args[1], ex.args[2])  # Mod
    else
        throw(ArgumentError("unrecognized argument to @bgthreads"))
    end
end

To test this, I bolted on some time-logging to the tasks in my current project. There are 3,421 queries, and total running time is just under 30 seconds. The following shows the number of tasks running on each thread at 1-second intervals, using the normal @threads:

Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
Thread 2: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 3: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 4: 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 6: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 7: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 8: 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
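
For anyone wanting to produce a table like the one above, here is a minimal sketch of one way the logging could be done. The TaskRecord type and the logged and occupancy functions are hypothetical names invented for this example, not the code actually used for these measurements.

```julia
# Hypothetical record of one task: the thread it ran on and when (seconds since t0).
struct TaskRecord
    tid::Int
    start::Float64
    stop::Float64
end

# Run `f`, then append a timing record for it. The lock guards the shared vector.
# Assumes the task stays on one thread for its whole run, as it does in v1.3.
function logged(f, records::Vector{TaskRecord}, lk::ReentrantLock, t0::Float64)
    tid = Threads.threadid()
    start = time() - t0
    result = f()
    stop = time() - t0
    lock(lk) do
        push!(records, TaskRecord(tid, start, stop))
    end
    return result
end

# For each thread (row) and each sample instant t = 0, 1, 2, ... seconds (column),
# count how many logged tasks were running at that instant.
function occupancy(records, nthreads::Int, nsamples::Int)
    counts = zeros(Int, nthreads, nsamples)
    for rec in records, s in 1:nsamples
        t = s - 1
        if rec.start <= t < rec.stop
            counts[rec.tid, s] += 1
        end
    end
    return counts
end

# Small hand-made example: one long task on thread 1, two short ones on thread 2.
demo = [TaskRecord(1, 0.0, 2.5), TaskRecord(2, 0.0, 0.5), TaskRecord(2, 0.2, 1.5)]
counts = occupancy(demo, 2, 4)
for tid in 1:2
    println("Thread $tid: ", join(counts[tid, :], " "))
end
```

Each query would be wrapped as logged(() -> run_query(q), records, lk, t0), where run_query stands in for whatever the real work is.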

That same set of queries run under @bgthreads:

Thread 1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 2: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0
Thread 3: 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 4: 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 6: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 7: 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thread 8: 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Note that Thread 1 is untouched.

Also note, though, that the running time is not reduced. In this use case, 10% of the queries take 90% of the time, and clearly the Thread 2 batch (which used to be the Thread 1 batch) has some of the long ones. So here, @spawn would be better, but it forces a return to Thread 1:

Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
Thread 2: 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0
Thread 3: 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Thread 4: 3 1 1 1 2 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
Thread 5: 3 6 6 6 6 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
Thread 6: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 0
Thread 7: 2 1 1 1 1 1 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 8: 1 5 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 0 0 0 0 0 0 0

and its thread assignment is random, as a second run of the same queries shows:

Thread 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0
Thread 2: 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
Thread 3: 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
Thread 4: 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 2 1 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 5: 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
Thread 6: 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1 1 2 1 0 0 0 0 0 0 0 0 0
Thread 7: 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 5 1 1 1 1 0 0 0 0 0 0 0 0 0
Thread 8: 4 4 4 4 4 4 4 4 4 2 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0
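
To see why a static split suffers on a skewed workload like this one, here is a small simulation sketch. The function names, the made-up durations, and the choice of 7 workers are all assumptions for illustration. The "dynamic" model is an idealized greedy scheduler (each task goes to whichever worker frees up first), not a model of how @spawn actually assigns tasks.

```julia
# Finish time when the tasks are cut into contiguous, equal-sized chunks, one per worker.
function static_makespan(durations::Vector{Float64}, nworkers::Int)
    n = length(durations)
    chunklen = cld(n, nworkers)
    return maximum(sum(durations[i:min(i + chunklen - 1, n)]) for i in 1:chunklen:n)
end

# Finish time when each task is handed to the worker that becomes free first.
function dynamic_makespan(durations::Vector{Float64}, nworkers::Int)
    busy_until = zeros(nworkers)
    for d in durations
        w = argmin(busy_until)      # first worker to become free
        busy_until[w] += d
    end
    return maximum(busy_until)
end

# Made-up workload: 10 long tasks followed by 90 short ones, so roughly
# 10% of the tasks account for 90% of the total time.
durations = vcat(fill(9.0, 10), fill(0.1, 90))

println("static:  ", static_makespan(durations, 7))
println("dynamic: ", dynamic_makespan(durations, 7))
```

In this toy case all the long tasks fall into the first chunk, so the static split finishes only when that one worker does, while the greedy assignment spreads them out.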

The ideal case would be a Threads.@spawnat that allows active management of task assignment. Hopefully that comes in the future.


can this be put into a package somewhere?

Sorry for the slow response. Julia has stopped sending me emails for some reason.

In any case - your wish is my command: tro3/ThreadPools.jl on GitHub (improved thread management for background and nonuniform tasks in Julia). Docs at https://tro3.github.io/ThreadPools.jl :)


Thanks. I found it and I have been using it to do something silly
