Is there a way to enable/disable calls to Threads.@threads?
I’m developing an optimization solver. When the problem is sufficiently small, multi-threading hurts performance and incurs allocations (my code makes no allocations until I wrap it in @threads). However, as the problem size grows and the individual threaded operations become more computationally intensive (boosting the arithmetic intensity), I get some good performance increases.
I’d like to be able to enable/disable multi-threading as a user-specified option. I can’t just restart Julia with JULIA_NUM_THREADS=1, since @threads still allocates memory and is slower than if I had left it out. Can I do this using metaprogramming?
I know I can obviously just create duplicates of my functions and wrap some with @threads, but I’d rather avoid this if possible.
FYI: nearly all of my functions are trivially parallelizable, like this
function foo(vals, vars)
    for k in eachindex(vals)
        vals[k] = somefunction(vars[k])
    end
end
So, to be clear, is there not an easy way to do this at run-time?
For example, I can do this:
function run_kernel(vals, A, b, parallel=true)
    if parallel
        Threads.@threads for k in eachindex(vals)
            vals[k] = mykernel(A, b)
        end
    else
        for k in eachindex(vals)
            vals[k] = mykernel(A, b)
        end
    end
end
Is there not a way to do this without copying the code like that?
This sounds like it’d be better to dispatch on the problem size than on nthreads() == 1. If that’s the case, it’s better to use a threaded map that supports specifying a base-case size. It’s kind of a plug, but Transducers.jl has it. It would be something like:
using Transducers

xf = Map() do k
    vals[k] = mykernel(A, vars[k])
    nothing
end
foldl(right, xf, eachindex(vals))                # sequential
reduce(right, xf, eachindex(vals))               # parallel
reduce(right, xf, eachindex(vals), basesize=10)  # parallelize if length(vals) > 10
(I used a side effect in Map, which is a bit nasty. It’d be better to use collect and tcollect, but they allocate.)
Is there not a way to do this without copying the code like that?
I have an @onthreads macro in ParallelProcessingTools.jl that does this for you. It also pins the tasks to threads though, a legacy of pre-partr times (I’ll change that in the future, or make it more flexible at least). One of the motivations for @onthreads was easy testing of thread-scaling of code.
The basic Threads.@threads is too simple for that. However, if you are a bit more serious about multi-threading, you likely use something else anyway.
If you do the chunking yourself (e.g. via ChunkSplitters.jl or OhMyThreads.jl) you can just use a single chunk to effectively disable multi-threading.
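One way to make a single chunk skip task creation entirely is to branch on the chunk count. What follows is only a minimal sketch using Base, not the ChunkSplitters.jl or OhMyThreads.jl API; the function name `chunked_map!` and the `nchunks` keyword are hypothetical, chosen for illustration.

```julia
# Hypothetical helper (not from any package): applies `f` elementwise and
# splits the index range into at most `nchunks` contiguous chunks.
# With `nchunks <= 1` it runs a plain loop and spawns no task at all.
function chunked_map!(f, vals, vars; nchunks::Integer=Threads.nthreads())
    n = length(vals)
    if nchunks <= 1 || n == 0
        # Serial path: no closure, no task.
        for k in eachindex(vals, vars)
            vals[k] = f(vars[k])
        end
        return vals
    end
    chunklen = cld(n, nchunks)  # chunk length, rounded up
    @sync for chunk in Iterators.partition(eachindex(vals, vars), chunklen)
        # One task per chunk; each task writes to a disjoint index range.
        Threads.@spawn for k in chunk
            vals[k] = f(vars[k])
        end
    end
    return vals
end
```

Calling `chunked_map!(somefunction, vals, vars; nchunks=1)` then takes the serial path. Whether that path is really allocation-free for a given kernel is worth checking with `@allocated`.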
I don’t see the need to depend on different packages; you can just create an internal macro to turn @threads on and off, no? It’s just an if statement inside a macro, if that’s all you want.
I’ve done this to switch between parallelism in SymbolicRegression, including distributed mode:
Which you can then use like
out = @sr_spawner(
    do_stuff(),
    parallelism=:multithreading,  # runtime value
    worker_idx=i,                 # worker index (or unused)
)
Pretty simple and gets the job done.
This is for manually spawning tasks, but you could do the same with a loop-wrapping macro.
@abraemer I am using ChunkSplitters.jl, and while I agree that using nchunks=1 is practically the same as running single-threaded, unfortunately it does not deactivate @threads. I am developing an algorithm and want to make sure my code does not allocate but can also be run in parallel if needed, which is why I need such an option for @threads.
@MilesCranmer I think I will investigate what you mention. I should be able to make a copy of @threads that takes an extra argument like execute, which can be either true or false. Does this seem valid?
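A loop-wrapping macro along those lines could look roughly like the sketch below. The names `@maybe_threads` and `fill_squares!` are hypothetical, and this wraps the existing `Threads.@threads` rather than copying its implementation. The macro expands to an ordinary if/else, so the loop body is written once in the source even though it appears twice in the expansion.

```julia
# Hypothetical macro: `@maybe_threads flag for ... end` runs the loop under
# `Threads.@threads` when `flag` is true at run time, and as a plain serial
# loop otherwise.
macro maybe_threads(flag, loop)
    (loop isa Expr && loop.head === :for) ||
        throw(ArgumentError("@maybe_threads expects a `for` loop"))
    # The whole expression is escaped so that `@threads` receives the raw
    # `for` expression and every name resolves in the caller's scope.
    return esc(quote
        if $flag
            Base.Threads.@threads $loop
        else
            $loop
        end
    end)
end

# Example use; the `parallel` keyword plays the role of the `execute` argument.
function fill_squares!(vals, vars; parallel::Bool=true)
    @maybe_threads parallel for k in eachindex(vals)
        vals[k] = vars[k]^2
    end
    return vals
end
```

Because the threaded branch still lives in the same function, it is worth confirming with `@allocated` that the serial branch stays allocation-free for the real kernel.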