Did you check if the two versions return the same result? For me, mysum_dist2
returns zero.
When you create a distributed array, the whole data is distributed to the worker process, which is why
julia> localpart(adist)
0-element Array{Float64,1}
However, using the @spawnat
macro you can run a command on a certain worker, so
julia> r = @spawnat 2 localpart(adist)
Future(2, 1, 115, nothing)
will run the localpart
function on the worker with id 2. However, the result of the function only exists on the worker and only after the function has completed, so we can call fetch
which will wait for the operation to finish and then copies the data from the worker back to the main process.
julia> fetch(r)
2500000-element Array{Float64,1}:
0.2612222333532057
0.9666371698145217
⋮
0.6747917592939274
which is the part of the distributed array that belongs to worker 2. Because transferring data between processes is slow, you want to do the reduction (calling sum
) on the worker and then only transfer the result back, which is a single number and
julia> fetch(@spawnat 2 sum(localpart(adist)))
1.2494661338187964e6
does exactly this. I hope this helps understand what is happening here.