Nesting `Threads.@threads` and `Polyester.@batch` (or a context manager to limit Polyester threads)

How dangerous is it to run `Polyester.@batch` inside of `Threads.@threads`?

An example where I might need this: a library implements a low level function that uses Polyester for multithreading, and I want to run multiple such functions in parallel.

Furthermore, to make this actually useful (i.e., to keep a single `Polyester.@batch` from hogging all the threads), is it possible to control the number of Polyester threads with a context manager of some kind? E.g., how would I implement this pseudocode:

```julia
function f(arg)
    println(arg)
    Polyester.@batch for i in 1:10000
        do_something()
    end
end

Threads.@threads for j in 1:2
    set_max_polyester_threads(2)  # hypothetical: cap Polyester threads for this task
    f(j)
end
```

I want this code to run 4 threads in parallel: two threads dedicated to `f(1)` and two threads dedicated to `f(2)`. I want this to work in a situation where I cannot modify `f` itself. Is this possible?

Background: I am using Polyester because benchmarking has shown that cheap threads for the “inner” problem do provide a big performance gain. On the other hand, the “inner” problem cannot effectively use more than a handful of threads. The outer problem (running the inner problem multiple times) is embarrassingly parallel.

I had some functionality to support something like that in Polyester before, but ripped it out as it wasn’t documented and I didn’t think anyone was using it, and I didn’t feel like maintaining it at the time.

Something like that could be added again, but wouldn’t it be better to just disable inner threading entirely, focusing on outer threading of your embarrassingly parallel program only?

That’d be easier to do without modifying `f`, and in fact is something you can already do without modifying the library:

```julia
using PolyesterWeave

t, r = PolyesterWeave.request_threads(PolyesterWeave.num_threads())
# Polyester, LoopVectorization's threading, and Octavian's threading are now disabled
foreach(PolyesterWeave.free_threads!, r)
# they're now re-enabled
```
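For the original example, a minimal sketch of the whole pattern might look like the following. This is an assumption about how the pieces fit together, not a documented recipe: the inner Polyester threading is disabled around the entire outer loop, so `Threads.@threads` supplies all the parallelism and `f` itself is untouched.

```julia
using PolyesterWeave

# f is the library function from the question; it uses Polyester.@batch
# internally, but we never modify it.
t, r = PolyesterWeave.request_threads(PolyesterWeave.num_threads())
try
    Threads.@threads for j in 1:2
        f(j)  # the Polyester.@batch inside f now runs serially
    end
finally
    # always give the threads back, even if an iteration throws
    foreach(PolyesterWeave.free_threads!, r)
end
```

The `try`/`finally` is there so the threads are re-enabled even if `f` errors; without it, a single exception would leave Polyester-family threading disabled for the rest of the session.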

Yes, this perfectly addresses my use case. And you are right about just disabling the inner threads. I guess the memory pressure might be a bit worse if I disable the inner threads, but it is not worthwhile for me to profile that yet.

Also, since this definitely relies on internals, we could expose something with guaranteed stability.
Perhaps a function, `disable_polyester_threads(f::F)`.
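A minimal sketch of such a helper, built from the `PolyesterWeave` calls shown above; the name `disable_polyester_threads` and the exact request/free API are assumptions here, not a stable interface:

```julia
using PolyesterWeave

# Run f() with Polyester/LoopVectorization/Octavian threading disabled,
# then re-enable it. Sketch only; relies on PolyesterWeave internals.
function disable_polyester_threads(f::F) where {F}
    t, r = PolyesterWeave.request_threads(PolyesterWeave.num_threads())
    try
        return f()  # Polyester-family threading is disabled in here
    finally
        foreach(PolyesterWeave.free_threads!, r)  # re-enable
    end
end
```

It could then be called as `disable_polyester_threads(() -> f(j))`, or with `do` syntax, from inside each outer task.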


I will make a pull request to Polyester with this.


Here is a draft: disable_polyester_threads by Krastanov · Pull Request #86 · JuliaSIMD/Polyester.jl · GitHub

> I had some functionality to support something like that in Polyester before, but ripped it out as it wasn’t documented and I didn’t think anyone was using it, and I didn’t feel like maintaining it at the time.

I have been searching for something exactly like this. My “inner problems” are large loops which I’d like to multithread with `@tturbo`, while the outer task is just a couple of invocations of the inner loop with different arguments. Rather than doing, e.g.,

```julia
@turbo thread=40 <some work>
@turbo thread=40 <some work>
```

I would love to be able to do

```julia
@sync begin
    Threads.@spawn begin
        @turbo thread=20 <some work>
    end
    Threads.@spawn begin
        @turbo thread=20 <some work>
    end
end
```

but with the thread counts determined dynamically in the outer task.
It seems like the latter ought to give a better speedup for a range of loop sizes.
It also lets you get around Amdahl’s law better: if you have some piece of work that isn’t as large a loop but can be computed independently of the other big loops, you can run it concurrently on just one Polyester thread alongside them.
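One way to pick those thread counts dynamically in the outer task is to just partition the available Julia threads among the spawned tasks. `split_threads` below is a hypothetical helper, not part of any package; it only assumes the inner kernel accepts a count the way `thread=N` does.

```julia
# Divide `total` threads as evenly as possible among `ntasks` outer
# tasks; the first `rem` tasks get one extra thread.
function split_threads(ntasks::Integer, total::Integer = Threads.nthreads())
    base, rem = divrem(total, ntasks)
    return [base + (i <= rem ? 1 : 0) for i in 1:ntasks]
end

# e.g. split_threads(2, 40) gives [20, 20], so each of two spawned
# tasks could run its kernel with `thread=20`.
```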

You mentioned you ripped it out of Polyester, so probably your judgment about whether it belonged there was correct, but what would it take to implement the thing I’ve sketched here in my own code?

Polyester is more stable now, feel free to make a PR to add something like that back.