I am trying to understand how different workers can communicate in Julia.
In the Parallel Computing section of the documentation for earlier versions of Julia (<= v0.6), a simple pmap function was given as an example. Adapted to work in Julia >= 1.0, this simple pmap is:
function simplepmap(fun, collection)
    n = length(collection)
    results = Vector{Any}(undef, n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    nextidx() = (idx=i; i+=1; idx)
    @sync for worker in workers()
        @async begin
            while true
                idx = nextidx()
                if idx > n
                    break
                end
                results[idx] = remotecall_fetch(fun, worker, collection[idx])
            end
        end
    end
    return results
end
In this code, the role of the function nextidx is to hand out the next index: it returns the current value of the counter i and then increments i, thus moving to the next element of collection.
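To make sure I read nextidx correctly, here is a minimal sketch of the same closure in isolation. The wrapper make_counter is a hypothetical helper I introduced only for this illustration; it is not part of the documentation example.

```julia
# Standalone sketch of the counter closure used in simplepmap.
# make_counter is a hypothetical helper, not part of the original code.
function make_counter()
    i = 1
    # Returns the current value of i, then advances i by one.
    # idx is a local of nextidx; i is captured from make_counter.
    nextidx() = (idx = i; i += 1; idx)
    return nextidx
end

nextidx = make_counter()
first_three = [nextidx() for _ in 1:3]
println(first_three)   # prints [1, 2, 3]
```

Each call returns a fresh value, and the only state that changes between calls is the captured counter i.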
However, if we try to write the code as:
function simplepmap_wrong(fun, collection)
    n = length(collection)
    results = Vector{Any}(undef, n)
    idx = 0
    @sync for worker in workers()
        @async begin
            while true
                idx += 1
                if idx > n
                    break
                end
                results[idx] = remotecall_fetch(fun, worker, collection[idx])
            end
        end
    end
    return results
end
This code fails. For example, if we try:
julia> using Distributed
julia> addprocs(4)
julia> @everywhere f(x)=2*x
julia> xs = collect(1:10)
julia> simplepmap(f, xs) # works
julia> simplepmap_wrong(f, xs) # This Fails!: BoundsError: attempt to access 10-element Array{Any,1} at index [20]
However, in simplepmap_wrong, the line idx += 1 increases idx by one, the same thing that is performed by idx = nextidx() in simplepmap.
Why is there this difference in behaviour?
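To narrow this down, here is a single-process sketch that mimics the structure of simplepmap_wrong without any workers. It rests on an assumption of mine: that remotecall_fetch is a point where a task can yield to other tasks, which I imitate here with sleep. The function shared_index_demo and the variables seen and before are names made up for this experiment.

```julia
# Single-process sketch: sleep() stands in for remotecall_fetch as a
# point where the current task yields (an assumption, see above).
# shared_index_demo is a hypothetical name used only for this experiment.
function shared_index_demo(n)
    seen = Tuple{Int,Int}[]    # (idx before yielding, idx after yielding)
    idx = 0                    # one variable shared by all tasks
    @sync for _ in 1:2
        @async while true
            idx += 1
            idx > n && break
            before = idx       # local to this loop iteration
            sleep(0.01)        # the task yields here; the other task may run
            push!(seen, (before, idx))
        end
    end
    return seen
end

observed = shared_index_demo(4)
foreach(println, observed)
```

If the two numbers in a printed pair differ, the shared idx was changed by the other task while this one was waiting, and the second number can even exceed n. That would be consistent with the BoundsError above, but I am not sure it is the whole explanation.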
Thank you for your help!
(P.S.: I also noticed that the documentation for versions >1.0 still has a paragraph that explains the simple pmap function, even though the code is no longer shown. Is this a relic from past versions of the documentation that should be corrected?)