[ANN] ThreadPools.jl - Improved thread management for background and nonuniform tasks

tro3 · January 20, 2020, 3:49pm

I am happy to announce the registration of ThreadPools.jl.

As of Julia 1.3, the algorithm of the @threads macro is to pre-divide the incoming range into equal partitions running on each thread. This is very efficient for most use cases, but there are a couple of scenarios where this is a problem:

When something else is running on the primary thread, like a GUI or a web server
When the tasks are very nonuniform - this can lead to one thread working on the longer tasks while the others sit idle

ThreadPools.jl exposes @fgthreads and @bgthreads, both of which actively manage the thread pool so that when one task finishes, the next one in the queue starts. If one thread is hit with a long task, the others will keep processing the shorter tasks and if a second long one comes along, it will be processed in parallel with the first long one.

I had to produce this package for myself, as I hit both of the above pain points at the same time in my use case. This community has been very helpful, so I figured I’d fill out the corners, document it, and see if you all find it useful. If so, I am committed to keeping it maintained for as long as needed. This is my first registered package, so if there is anything I missed in making this more useful, don’t hesitate to let me know.

Usage

Basic usage is pretty easy. Just replace @threads or map with @fgthreads/@bgthreads or fgmap/bgmap (bg versions are background, keeping the primary thread free):

julia> @bgthreads for x in 1:3
         println("$x $(Threads.threadid())")
       end
2 3
3 4
1 2

julia> bgmap([1,2,3]) do x
         println("$x $(Threads.threadid())")
         x^2
       end
2 3
3 4
1 2
3-element Array{Int64,1}:
 1
 4
 9

Logging

I found this very useful. There are logging versions of all the functions to help tune performance:

julia> ThreadPools.logfgforeach(x -> sleep(0.1*x), "log.txt", 1:8)

julia> ThreadPools.showstats("log.txt")

    Total duration: 1.217 s
    Number of jobs: 8
    Average job duration: 0.46 s
    Minimum job duration: 0.112 s
    Maximum job duration: 0.805 s

    Thread 1: Duration 1.217 s, Gap time 0.0 s
    Thread 2: Duration 0.827 s, Gap time 0.0 s
    Thread 3: Duration 0.613 s, Gap time 0.0 s
    Thread 4: Duration 1.023 s, Gap time 0.0 s

julia> ThreadPools.showactivity("log.txt", 0.1)
0.000   -   -   -   -
0.100   4   2   1   3
0.200   4   2   5   3
0.300   4   6   5   3
0.400   4   6   5   7
0.500   8   6   5   7
0.600   8   6   5   7
0.700   8   6   -   7
0.800   8   6   -   7
0.900   8   -   -   7
1.000   8   -   -   7
1.100   8   -   -   -
1.200   8   -   -   -
1.300   -   -   -   -
1.400   -   -   -   -

There is also a logging version of the original @threads, so you can measure performance of your use case with the built-in macro.

Structure

The ThreadPool is simply a Channel of Tasks that feeds to a number of Task handlers on each thread. Each handler processes the task and hands back to an output Channel. Then there is a result iterator to fetch the results:

put!(pool, fn, args...)
      |
      V
Channel{Task}
      |
      |----------------------------   ...
      |                   |
      V                   V
Thread 1 Handler   Thread 2 Handler   ...
  (optional)              |
      |                   |
      |<---------------------------   ...
      | 
      V
Channel{Task}
      |
      V
results(pool)

Pretty straightforward stuff - the channels serve as a queue of tasks to be completed, and the handlers just process them as they become available. I attempted to wrap it into an even simpler API so the under-the-hood stuff doesn’t have to be dealt with.

Anyhow, thanks for your help, and all feedback appreciated. Cheers!

tobias.knopp · January 20, 2020, 3:58pm

Thanks for working on this. Does this only work when you start Julia with at least 2 threads? Does it error or deadlock in case of one thread?

tro3 · January 20, 2020, 4:25pm

The simple API functions (@fgthreads, fgmap, @bgthreads, bgmap) all fall back gracefully in the single-thread case. (Which is a pain for testing, BTW - two differing Julia sessions needed.) Using the underlying ThreadPool structure or the logging versions will all throw an error. I was thinking of adding the gracefull fallbacks to the logging cases, if they seem useful.

JeffreySarnoff · January 20, 2020, 4:28pm

Please help me fill in what I have missed. It seems that you provide two groups of threads, the “foreground threads” and the “background threads” and one “foreground thread” is distinguished as the “primary thread”. Is the primary thread always the thread with id=1? Is the primary thread task assignable (e.g. run the UI)? How does one put additional tasks into “foreground threads” and other tasks into “background threads”?

tro3 · January 20, 2020, 4:31pm

There is no explicit thread assignment capability here. (Julia as of now makes that a little difficult, but I think possible.) This just creates a pool of task handlers that manages themselves - that pool may involve the primary thread (fg) or not (bg). The fg versions dont force operation in the foregoround, they just allow it.

The primary thread is indeed id==1, and is the thread that Julia starts on. So any GUI or server async Task will be there. This may change in the future, of course, since the threading thing is new.

Does that help?

JeffreySarnoff · January 20, 2020, 4:43pm

Do the bg versions disallow operation in the foreground?

tro3 · January 20, 2020, 4:45pm

Yes. That way if you have say a server running on the main thread, it won’t go unresponsive if a heavy (ie blocking) task is submitted. (This was one of the issues I had with @threads)

louisponet · January 20, 2020, 5:02pm

Could you elaborate a little more on what the main difference with respect to the functionality of the ‘built in’ Julia scheduler is? Couldn’t a similar thing be achieved with the @spawn macro inside a loop?

Is the main benefit the possibility of subdividing into fg and bg threads?

Sukera · January 20, 2020, 5:10pm

@spawn potentially spawns the task on the main thread - the @bgthreads macro from this package prevents that.

help?> Threads.@spawn
  Threads.@spawn expr

  Create and run a Task on any available thread. [...]

dmolina · January 20, 2020, 5:11pm

Thanks you, it seems very nice and useful!

One question, the difference between fgthreads and bgthreads is that the current/first thread in the second one is not used, didn’t it?

tro3 · January 20, 2020, 5:11pm

Yes.

dmolina · January 20, 2020, 5:13pm

Ok, then it is a very intuitive concept.
Thanks you again, it is very useful when the time used for each thread could be difference.

louisponet · January 20, 2020, 5:14pm

Okay yea that’s what I meant with my second sentence, that’s nice though very useful for making sure things don’t run on the primary thread!

tro3 · January 20, 2020, 5:18pm

@Sukera is correct about @spawn. There are actually two schedulers running around Julia currently. The @spawn macro sends the task to the C-level scheduler, which is explicitly random and includes the primary. The @threads macro divvies up the work before scheduling, then schedules a chunk on each thread (again including the primary). Both schedule their work before it is known how long that work will take.

ThreadPools will only take on work once the previous jobs have finished. If every job is exactly the same size, the overhead is not worth it - stick with @threads. But if there are a few jobs way longer than the rest, I’ve found the above queuing strategy saves time (about 30% faster in my use case).

tro3 · January 20, 2020, 5:28pm

There are actually two axes that are different from Base here, and they do solve different problems:

The scheduling strategy helps in the nonuniform job case
Prevention of the primary thread usage helps in the “stay in the backgound” case

It occurs that one thing I did not do, but should, is include a macro for the “I have very short, uniform computations, but I need the primary thread free” scenario. For that, an alternate @threads that uses the pre-division strategy but prevents scheduling to the primary would be most efficient. Easy enough to add while I have my head wrapped around the details, but I’m not sure what to call it to differentiate it clearly from the @bgthreads macro.

JeffreySarnoff · January 20, 2020, 5:43pm

@npthreads where np is non primary (pun intended)

tro3 · January 20, 2020, 5:48pm

I like it. I’ll add it for v0.3.0, and I’ll upgrade the documentation to show the decision matrix, something like:

                             Uniform jobs         Non Uniform Jobs 
                          -----------------------------------------
Primary thread usable     |  @threads               @fgthreads
                          |
Primary thread not usable |  @npthreads             @bgthreads

tro3 · January 20, 2020, 5:58pm

Alternately, if we separate the queueing and background axes, we could have:

                             Uniform jobs         Non Uniform Jobs 
                          -----------------------------------------
Primary thread usable     |  @threads               @qthreads
                          |
Primary thread not usable |  @bgthreads             @bgqthreads

the main problem - it breaks the API already. But I think it is a clearer API than what I have now once the non-primary pre-division option is added. Ah, 20/20 hindsight…

Sukera · January 20, 2020, 7:15pm

You’re in v0.2 of the package and announced it just 3 hours ago - don’t worry about it, at this point a clearer API is welcome

JeffreySarnoff · January 20, 2020, 7:51pm

Yes, the 0.low_digit versions often are used while getting things right.
You can modify or restructure an as-yet-unsettled API, you cannot break it.

a minor suggestion: use the q as a prefix rather than as an infix

renames @bgqthreads to @qbgthreads
– this makes the table simpler to follow (imo)