If pixel27’s proposal works for you, then that is probably the best way to achieve what you want.
Aside: You can avoid creating the task list explicitly by using
julia> @sync @distributed for i in CartesianIndices((2,2,2))
           @show i.I
       end
From worker 2: i.I = (1, 1, 1)
From worker 2: i.I = (2, 1, 1)
From worker 3: i.I = (1, 2, 1)
From worker 3: i.I = (2, 2, 1)
From worker 5: i.I = (1, 2, 2)
From worker 5: i.I = (2, 2, 2)
From worker 4: i.I = (1, 1, 2)
From worker 4: i.I = (2, 1, 2)
If this doesn’t work for you, then it would be good if you could share a minimal working example (MWE) that demonstrates, in particular, how some items are missed. On my end, nested @distributed loops seem to work unless I put the @sync in front of them.
Regarding your questions about @async, etc.: I must admit that I am no expert in parallel computing using Julia myself, but here is how I understand it.
When you start Julia with julia -p p, you create one master process and p worker processes. Each of these processes (both the master and the workers) keeps a list of tasks which it should complete, and it can switch between these tasks if progress on one task depends on input from another task or another process.
A loop of the form
@distributed for i = range
    [some code]
end
splits range into p pieces of as-equal-as-possible lengths and then adds to the task list of each worker process a task of the form
for i = [this process's share of range]
    [some code]
end
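To make the splitting step concrete, here is a rough sketch in plain Julia. Note that split_range is a name I made up for illustration; it is not part of Distributed, and the real implementation may hand out the remainder differently. It only shows what "as-equal-as-possible" chunks look like.

```julia
# Hypothetical helper (not a Distributed API): split the range `r` into `p`
# contiguous chunks whose lengths differ by at most one.
function split_range(r::AbstractUnitRange{Int}, p::Integer)
    base, extra = divrem(length(r), p)   # `extra` chunks get one more item
    chunks = UnitRange{Int}[]
    start = first(r)
    for k in 1:p
        len = base + (k <= extra ? 1 : 0)
        push!(chunks, start:(start + len - 1))
        start += len
    end
    return chunks
end

# Ten items shared among four (assumed) workers.
chunks = split_range(1:10, 4)
# chunks == [1:3, 4:6, 7:8, 9:10]
```

Each worker would then run the loop body over its own chunk only.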
Furthermore, on the master process @distributed creates a task which consists simply of waiting for all the worker processes to complete their tasks, and it returns this task as the result of the @distributed for loop. This master task does not block progress in the “main” master task, however. If you want the main task to block until the workers are done, you have to explicitly call wait() on the @distributed for task. You can see this playing out in the following example.
julia> master_task = @distributed for i = 1:2
           @show i
       end
       println("Waiting for workers to finish")
       wait(master_task)
       println("All workers done")
Waiting for workers to finish
From worker 2: i = 1
From worker 3: i = 2
All workers done
“Waiting for workers to finish” appears before the output from the workers because the master task does not wait for the workers to finish until we call wait(master_task).
@sync is intended to relieve you of the burden of explicitly keeping track of the master_task and waiting for it to finish. For example,
@sync @distributed for ...
    [some code]
end
is equivalent to
master_task = @distributed for ...
    [some code]
end
wait(master_task)
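If you want to check this yourself, the two forms can be run side by side. This is only a sketch: addprocs(2) is my assumption for a quick local test (starting Julia with julia -p 2 should behave the same), and the loop body is a placeholder.

```julia
using Distributed
addprocs(2)  # assumption: two local workers, just for this test

# Explicit form: keep the returned task and wait on it by hand.
master_task = @distributed for i = 1:4
    println("explicit: i = ", i)
end
wait(master_task)          # blocks until every worker chunk has finished
println("explicit form done")

# Implicit form: @sync does the bookkeeping and the waiting.
@sync @distributed for i = 1:4
    println("sync: i = ", i)
end
println("@sync form done")
```

In both cases the "done" line should only be printed once the workers have finished; the order of the worker lines themselves is not guaranteed.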
I believe that this is all that is needed to understand why the @sync fails in your example. (It’s possible that I am off, though. Corrections welcome.) In code of the form
@distributed for i = 1:2
    @distributed for j = 1:2
        [some code]
    end
end
the inner @distributed for loop is executed on the worker processes, and hence the master_tasks associated with the inner for loop live on the worker processes, not the master. If you wrap an @sync around all of this, then this @sync gets confused about how to properly handle the various tasks on the various processes, and this is what leads to the error message.
It is in principle possible to avoid this @sync confusion, but even then it remains questionable to use nested @distributed for loops, since the outer @distributed loop simply amounts to parallelising the launching of worker tasks, which is more complicated and likely less performant than it could be.
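One way to sidestep the nesting, in the spirit of the CartesianIndices aside at the top, is to flatten the two loops into a single @distributed loop over all (i, j) combinations. Again just a sketch: addprocs(2), the name index_space and the loop body are my assumptions, not taken from your code.

```julia
using Distributed
addprocs(2)  # assumption: two local workers; adjust to your setup

# All (i, j) combinations of the nested loops `for i = 1:2` and `for j = 1:2`.
index_space = CartesianIndices((2, 2))

# A single flat loop: chunks of index_space go straight to the workers, so
# every task to wait on lives on the master and @sync can handle it.
@sync @distributed for idx in index_space
    i, j = Tuple(idx)      # recover the two loop variables
    println("i = ", i, ", j = ", j)
end
```

This way only the master launches worker tasks, which is the situation @sync is designed for.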