I would like to experiment with parallelizing an embarrassingly parallel computation on an iterable using the new multithreading in Julia 1.3. I need to collect the results, and I would prefer to leave return-type determination to Julia, so I would rather not preallocate.

Do I just need to `@spawn` and then `fetch`, as in
```julia
using Base.Threads: @spawn, threadid  # fetch is exported from Base

# spawn one task per element, then fetch all results;
# the result eltype is left for Julia to infer, no preallocation
function ploop(f, itr)
    map(fetch, map(i -> @spawn(f(i)), itr))
end

function pmapreduce(f, op, itr)
    mapreduce(fetch, op, map(i -> @spawn(f(i)), itr))
end

# stand-in workload: report the thread id, sleep a bit, return a Float64
f(i) = (@show threadid(); sleep(rand()); Float64(i))

pmapreduce(f, +, 1:10)
```
(The motivation for this approach is that in practice `f` itself can use threads, and the whole thing hopefully composes neatly.)
For `mapreduce` it might make sense to split up the list recursively, so that `op` can also run in parallel.
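A minimal sketch of that idea, assuming an indexable input: split the range in half, spawn a task for one half, recurse on the other, and combine with `op` on the way back up, so the reductions themselves overlap. The name `preduce_recurse` is invented for this example.

```julia
using Base.Threads: @spawn

# Divide-and-conquer sketch: `op` combines partial results as the
# recursion unwinds, so reduction work is no longer purely serial.
function preduce_recurse(f, op, itr)
    n = length(itr)
    n == 1 && return f(first(itr))
    mid = n ÷ 2
    left  = @spawn preduce_recurse(f, op, itr[1:mid])  # left half in a new task
    right = preduce_recurse(f, op, itr[mid+1:end])     # right half in this task
    op(fetch(left), right)
end

preduce_recurse(x -> x + 1, +, 1:10)  # → 65
```

Recursing in the current task on one half (rather than spawning both halves) halves the number of tasks created for the same amount of parallelism.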
A parallel `mapreduce` should include a lower threshold below which it stays serial. That threshold is related to the ratio between the cost of calling `f` and the cost of spawning a task. For example, if `f(x) = x + 1`, then spawning a task is approximately 5000 times more expensive than one call (obviously this is the worst case, but the ratio between the time for one operation and the time to spawn a task gives a good approximation).
Spawning one task per operation seems excessive, but that strategy pays off if the operation is expensive enough to overcome the task-spawning overhead (by how much? 10 times?).
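One common way to amortize that overhead is chunking: each spawned task reduces a whole slice of the input serially, so the per-task cost is paid once per chunk rather than once per element. This is a hedged sketch, not any library's API; `chunked_pmapreduce` and the `basesize` keyword are names invented here.

```julia
using Base.Threads: @spawn, nthreads

# Chunked sketch: one task per chunk of `basesize` elements; each task
# runs a plain serial mapreduce, and the chunk results are combined at the end.
function chunked_pmapreduce(f, op, v::AbstractVector;
                            basesize = cld(length(v), nthreads()))
    chunks = Iterators.partition(v, max(basesize, 1))
    tasks = [@spawn mapreduce(f, op, c) for c in chunks]  # coarse-grained tasks
    mapreduce(fetch, op, tasks)                           # combine partials
end

chunked_pmapreduce(x -> x^2, +, 1:100; basesize = 25)  # → 338350
```

With `basesize` large enough that one chunk's work dwarfs the cost of a `@spawn`, the overhead becomes negligible even for cheap `f`.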
Obviously: the actual problem I am working on takes about 2–10 minutes for a single call, and the `sleep` in the MWE is a stand-in for this.
Then ThreadTools.jl does exactly what you want.

ThreadTools.jl is currently implemented with somewhat expensive computations in mind; there is a benchmark in the README that indicates roughly where the overhead starts becoming excessive.