Let’s say I have started Julia with JULIA_NUM_THREADS=4, but in one of my loops I only want to use 2 threads because that gives better performance. How do I make Threads.@threads use only 2 threads?
Are there any other libraries that give control over the number of threads that can be used? I looked at FLoops but couldn’t find anything relevant to my problem.
You can use whatever threading solution you want, including a Base.Semaphore to control the number of threads active at a time.
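    using Base.Threads  # `using Threads` alone does not resolve; the module lives in Base

    sem = Base.Semaphore(2)  # at most 2 at a time
    Threads.@threads for ii in 1:100
        Base.acquire(sem)
        try
            println(threadid())
        finally
            Base.release(sem)  # always give the slot back, even on error
        end
    end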
This will ensure only 2 threads are doing anything at a time.
Though which two may change (and in the case of Threads.@threads it will, because of how it allocates work in advance).
It may not lead to optimal scheduling, however.
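To check the "at most 2 at a time" claim, here is a rough sketch that records the peak number of loop bodies running concurrently. The function name peak_concurrency and the short sleep standing in for real work are illustrative assumptions, not part of the original answer.

```julia
using Base.Threads

# Hypothetical helper: run `n` iterations under a semaphore with `limit` slots
# and return the largest number of iterations that were active simultaneously.
function peak_concurrency(limit::Integer, n::Integer)
    sem = Base.Semaphore(limit)
    active = Atomic{Int}(0)  # iterations currently inside the guarded region
    peak = Atomic{Int}(0)    # highest value `active` has reached
    @threads for i in 1:n
        Base.acquire(sem)
        try
            now = atomic_add!(active, 1) + 1  # atomic_add! returns the old value
            atomic_max!(peak, now)
            sleep(0.001)  # stand-in for real work
        finally
            atomic_sub!(active, 1)
            Base.release(sem)
        end
    end
    return peak[]
end
```

Calling peak_concurrency(2, 50) should never report more than 2, whatever JULIA_NUM_THREADS is. The remaining threads still get a share of the iterations but spend their time waiting on the semaphore, which is one reason the scheduling may be suboptimal.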
ThreadPools exposes multiple different scheduling algorithms, and they can take a pool argument covering a subset of threads, which you can construct in advance:

    using ThreadPools
    using Base.Threads  # for threadid()

    pool = ThreadPools.StaticPool(3, 2)  # use only 2 threads, starting from thread 3
    @tthreads pool for ii in 1:100
        println(threadid())
    end

This actually will ensure only threads #3 and #4 are used.
For performance, a better way to limit the number of tasks spawned is to specify the base case size rather than hard-coding the number of tasks. This is especially true in library code. In JuliaFolds, you can specify this via the basesize parameter in various APIs.
This is a better approach since it works well even when the input size changes. For an input smaller than basesize, you’d get a single-threaded program with no Task spawn overhead. For a very large input, it’ll use additional CPUs as needed.
The basesize parameter can also be used to “simulate” different JULIA_NUM_THREADS values without restarting Julia; see Frequently asked questions.
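To make the base case idea concrete without depending on any package, here is a minimal hand-rolled sketch. It is not the JuliaFolds implementation; the function dac_sum and its Float64 accumulator are assumptions made for illustration, and only the keyword name basesize is borrowed from the text above.

```julia
using Base.Threads: @spawn

# Hypothetical helper: sum f(x) over xs, splitting the input in half until
# each piece has at most `basesize` elements.
function dac_sum(f, xs::AbstractVector; basesize::Integer = 1024)
    basesize >= 1 || throw(ArgumentError("basesize must be at least 1"))
    if length(xs) <= basesize
        # Base case: plain sequential reduction, no Task is spawned.
        return sum(f, xs; init = 0.0)
    end
    mid = (firstindex(xs) + lastindex(xs)) ÷ 2
    # The left half runs in a new task; the right half runs in the current one.
    left = @spawn dac_sum(f, view(xs, firstindex(xs):mid); basesize = basesize)
    right = dac_sum(f, view(xs, (mid + 1):lastindex(xs)); basesize = basesize)
    return fetch(left) + right
end
```

With basesize = cld(length(xs), 2) the input is split into exactly two base cases, so at most two threads are busy, which is roughly what running with JULIA_NUM_THREADS=2 would give you. For example, dac_sum(abs2, 1:1000; basesize = 500) processes 1:500 in a spawned task and 501:1000 in the calling task.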
This is interesting, but correct me if I am wrong: what you describe only works with already parallelised basic functions, and @oxinabox’s solution is more general, correct?
These “already parallelized basic functions” are just normal Julia functions. So, you can do the same for your hand-rolled function. For example, if you can rephrase your program as a divide-and-conquer algorithm, it’s often straightforward to use this strategy. Even if this does not work, as I said in Async limit - #3 by tkf, I’d recommend using the “worker pool” pattern rather than a semaphore. For more specific comments, I think it’d be helpful if you can provide an MWE.
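For reference, one common reading of the “worker pool” pattern is sketched below using only Base: a fixed number of tasks pull work items from a Channel. The helper name worker_pool_map and its ntasks keyword are hypothetical, and the linked post may structure this differently.

```julia
using Base.Threads: @spawn

# Hypothetical helper: apply f to every item using exactly `ntasks` worker tasks.
function worker_pool_map(f, items; ntasks::Integer = 2)
    jobs = Channel{Tuple{Int,eltype(items)}}(Inf)  # unbounded job queue
    for (i, x) in enumerate(items)
        put!(jobs, (i, x))
    end
    close(jobs)  # workers stop once the queue has been drained
    results = Vector{Any}(undef, length(items))
    @sync for _ in 1:ntasks
        @spawn begin
            # Each worker repeatedly takes the next job until none are left.
            for (i, x) in jobs
                results[i] = f(x)  # every job writes to its own slot
            end
        end
    end
    return results
end
```

Because only ntasks tasks exist, at most that many threads do work at once, and no task sits blocked on a semaphore. Unlike the ThreadPools example, this does not pin the work to particular thread ids. A call such as worker_pool_map(x -> x^2, 1:100; ntasks = 2) uses two workers regardless of JULIA_NUM_THREADS.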