I want to define a macro, @parallel, to turn parallelisation of for loops on or off, depending on a command-line argument. Here, the loops are embarrassingly parallel and the parallelisation uses Threads.@threads.
Why would I want to do this? Because I want to fully turn off multi-threading when checking for unwanted allocations. And Threads.@threads has spurious allocations.
Here is a MWE:
# Reading the command-line arguments
using ArgParse
tabargs = ArgParseSettings()
@add_arg_table! tabargs begin
    "--parallel"
        help = "Parallelisation: true/false"
        arg_type = Bool
        default = false
end
parsed_args = parse_args(tabargs)
const PARALLEL = parsed_args["parallel"]
#####
if PARALLEL
    macro parallel(ex::Expr) # We want to use multiple threads
        return :( Threads.@threads $(ex) )
    end
else
    macro parallel(ex::Expr) # No multi-threading
        return :( $(ex) )
    end
end
#####
# Using the macro in a function
function run!()
    @parallel for i=1:2
        sleep(1)
    end
end
#####
@time run!()
The code can then be tested in various regimes via
julia code.jl --parallel false (Time: 2s)
julia code.jl --parallel true (Time: 2s)
julia -t 1 code.jl --parallel false (Time: 2s)
julia -t 1 code.jl --parallel true (Time: 2s)
julia -t 2 code.jl --parallel false (Time: 2s)
julia -t 2 code.jl --parallel true (Time: 1s)
Hence, the definition of the macro seems (reasonably) correct. Yet, when used in my real test case, it leads to unexpected errors. I do not get these errors when using Threads.@threads directly. So, @parallel seems badly defined.
What would be the most idiomatic (and correct) way of implementing this macro in julia?
I’m sure you can use this with a proper Channel and switch between sequential and parallel processing via ntasks, but Threads.foreach seems a bit complicated.
A somewhat similar approach is suggested here by oxinabox: use Base.foreach for sequential and ThreadsX.foreach for parallel execution (and automatically switch based on the value of PARALLEL).
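A minimal sketch of that aliasing idea, for illustration only. To keep it free of extra packages, `threaded_foreach` is a hypothetical stand-in for ThreadsX.foreach built on Threads.@threads, `pforeach` and `fill_squares_foreach!` are made-up names, and PARALLEL is derived from the thread count here rather than from parse_args:

```julia
# Assumption: stand-in for the flag parsed from the command line in the thread.
const PARALLEL = Threads.nthreads() > 1

# Hypothetical stand-in for ThreadsX.foreach, so the sketch needs no packages.
function threaded_foreach(f, xs)
    Threads.@threads for x in xs
        f(x)
    end
    return nothing
end

# Alias chosen once at load time; `const` keeps the call type-stable.
const pforeach = PARALLEL ? threaded_foreach : Base.foreach

function fill_squares_foreach!(out)
    pforeach(eachindex(out)) do i
        out[i] = i^2
    end
    return out
end

fill_squares_foreach!(zeros(Int, 4))
```

With PARALLEL false, `pforeach` is literally Base.foreach, so no threading machinery is involved in the serial case.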
@jibe Check out the linked topic: it deals with the same question and has some more answers.
Fantastic! This does get the job done, no errors are left, and I got numerous links/references to mull over.
In practice, for the case PARALLEL == false, should I also make the change to
macro parallel(ex::Expr) # No multi-threading
    return esc(:( $(ex) ))
end
Thank you very much for your suggestion. Unfortunately, in the case PARALLEL == false, although the code would use a single thread, it would use Threads.foreach and not Base.foreach, hence making unwanted allocations?
Not a newb question – you’re absolutely right that you should do that!
Even if it didn’t affect allocations, always good to const a global if you don’t intend to change it.
I originally wrote this as defining a new function which calls either Base or ThreadsX. Then I changed my mind, because that would unnecessarily pollute the stack trace. But when I switched to just aliasing the functions, I forgot to add const.
I don’t understand your question here. In the case where PARALLEL is false, it will call Base.foreach, which is serial. Could you clarify what you mean?
There, one always uses Threads.foreach, even for PARALLEL == false. I believe that this would not have been ideal. Indeed, a threaded loop with a single thread is not the same as a “base” loop. Would you agree?
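Without esc, a loop body that uses a variable defined in the surrounding function, in casu sleep_time, would throw an ERROR: UndefVarError: `sleep_time` not defined. Macro hygiene renames the variables assigned in ex and resolves all others, such as sleep_time, as globals of the module where the macro is defined, so that Julia does not recognise them as coming from the surrounding scope.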
macro parallel(ex)
    return esc(ex)
end
(which is the same as your esc(:($(ex))), but a bit shorter) will again run fine.
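To make the difference concrete, here is a small sketch of both variants side by side. The names `@unescaped`, `broken_sleep` and `fill_squares!` are made up for this example, and PARALLEL is derived from the thread count as a stand-in for the parsed command-line flag:

```julia
# Assumption: stand-in for the flag parsed from the command line in the thread.
const PARALLEL = Threads.nthreads() > 1

if PARALLEL
    macro parallel(ex)  # multi-threaded: wrap the loop, then escape the whole thing
        return esc(:(Threads.@threads $ex))
    end
else
    macro parallel(ex)  # serial: hand the loop back untouched
        return esc(ex)
    end
end

# Hypothetical unescaped variant, kept only to reproduce the error.
macro unescaped(ex)
    return :( $(ex) )
end

function fill_squares!(out, offset)
    @parallel for i in eachindex(out)
        out[i] = i^2 + offset   # `out` and `offset` come from the enclosing function
    end
    return out
end

function broken_sleep()
    sleep_time = 0.01
    @unescaped for i in 1:2
        sleep(sleep_time)       # resolved as a global, not as the local above
    end
end

fill_squares!(zeros(Int, 4), 1)
```

Calling `broken_sleep()` should raise the UndefVarError described above, while `fill_squares!` behaves the same whether or not PARALLEL is set.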