The use of `Iterators.partition` is not ideal for this task, because it may produce chunks of uneven sizes, for example:
```julia
julia> length.(Iterators.partition(1:10, 4))
3-element Vector{Int64}:
 4
 4
 2
```
which is why I wrote the (simple but convenient) ChunkSplitters package, which will result in:
```julia
julia> using ChunkSplitters

julia> length.(map(first, chunks(1:10, 3)))
3-element Vector{Int64}:
 4
 3
 3
```
Also, if you do not want to spawn many tasks, I prefer the following pattern:
```julia
using Base.Threads: @threads
using ChunkSplitters

function solve(solvers, inputs; number_of_chunks=length(solvers))
    @threads for (i_range, i_chunk) in chunks(inputs, number_of_chunks)
        solver = solvers[i_chunk]  # one solver per chunk: no data races
        for i in i_range
            input = inputs[i]
            ...
        end
    end
    # reduce results
    return ...
end
```
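As a concrete (hypothetical) illustration of this pattern, here is a filled-in sketch where each "solver" is just a function and the reduction is a sum; the solvers, inputs, and reduction are placeholders for your actual objects, and it follows the same positional `chunks(x, nchunks)` API used above (newer ChunkSplitters versions use keyword arguments and return only the ranges):

```julia
using Base.Threads: @threads
using ChunkSplitters

# Hypothetical sketch: `solvers` is a vector of callables, one per chunk,
# and the per-input results are reduced with a sum.
function solve(solvers, inputs; number_of_chunks=length(solvers))
    results = zeros(length(inputs))
    @threads for (i_range, i_chunk) in chunks(inputs, number_of_chunks)
        solver = solvers[i_chunk]  # each task uses its own solver
        for i in i_range
            results[i] = solver(inputs[i])
        end
    end
    return sum(results)
end

solvers = [x -> 2x for _ in 1:Threads.nthreads()]
solve(solvers, 1:10)  # → 110.0
```

Because each chunk writes to a disjoint slice of `results`, no locking is needed in the loop body.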
This has two good properties: 1) it only spawns as many tasks as the number of chunks desired; 2) it has essentially no overhead relative to a simple `@threads`-ed loop. (You could do the same with `@sync` and `@spawn`, but there would be no advantage here.) To improve load balancing you can simply increase the number of chunks (or use the `:scatter` chunking option if there is a correlation between the index of the task and its cost).
If you don’t mind spawning many tasks, you can use channels directly, and there is no need for chunking:
```julia
using Base.Threads: @spawn

# Here `solvers` is a Channel holding the available solver instances.
function solve(solvers, inputs)
    @sync for input in inputs
        @spawn begin
            solver = take!(solvers)  # blocks until a solver is free
            ...
            put!(solvers, solver)    # return the solver to the pool
        end
    end
    # reduce
    return ...
end
```
You don’t need chunking because `take!` on a channel blocks: each spawned task only proceeds once a solver is available. This is better if the tasks are very uneven in cost and if the time required for spawning the tasks is negligible relative to the time of each task.
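To make that concrete, here is a minimal runnable sketch of the channel pattern, with made-up solvers (plain functions) and a sum as the reduction; everything except the `take!`/`put!` structure is a placeholder:

```julia
using Base.Threads: @spawn

# Hypothetical sketch: a pool of "solvers" (here just functions) shared
# through a Channel by one spawned task per input.
function solve(solvers, inputs)
    results = zeros(length(inputs))
    @sync for (i, input) in enumerate(inputs)
        @spawn begin
            solver = take!(solvers)   # blocks until a solver is free
            results[i] = solver(input)
            put!(solvers, solver)     # return it to the pool
        end
    end
    return sum(results)
end

pool = Channel{Function}(2)
put!(pool, x -> 2x)
put!(pool, x -> 2x)
solve(pool, 1:10)  # → 110.0
```

Each task writes to its own index of `results`, so no further synchronization is needed; the channel alone limits how many tasks work concurrently.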
P.S.: Still, I think there is something about channels that I don’t get exactly right… When I experiment with them I often reach locked states and I don’t understand exactly why; it seems that I can get a stalled `take!` call with the pattern above. This was the issue: when using channels, errors raised inside the spawned tasks do not surface, and the computation stalls.
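One way to avoid that stall, assuming the cause is a task erroring after `take!` and never reaching the matching `put!` (so every other task blocks forever on `take!` and `@sync` never returns), is to guard the work with `try`/`finally`, so the solver is always returned and the error can be rethrown by `@sync`. A sketch, with a deliberately failing hypothetical input:

```julia
using Base.Threads: @spawn

# Sketch: always return the solver, even if the work errors, so tasks
# blocked on take! can proceed and @sync can rethrow the failure.
function solve(solvers, inputs)
    @sync for input in inputs
        @spawn begin
            solver = take!(solvers)
            try
                # ... do the work with `solver` and `input` ...
                input < 0 && error("bad input")  # hypothetical failure
            finally
                put!(solvers, solver)  # runs even when an error is thrown
            end
        end
    end
    return nothing
end

pool = Channel{Symbol}(1)
put!(pool, :solver)
try
    solve(pool, [1, -1, 2])          # the task for -1 errors...
catch e
    println("caught: ", typeof(e))   # ...and the error surfaces here
end
isready(pool)                        # → true: the solver is back in the pool
```

Without the `finally`, the failing task would keep the single solver, the remaining task would block on `take!`, and `solve` would hang instead of throwing.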