Distributed generation of a DArray

parallel

#1

I have generated arrays on separate processes, and I cannot collect them into a single array on the master process because the combined array does not fit in memory. Is there a way to generate the DArray directly, without first materializing it on the master process?


#2

It is described in https://github.com/JuliaParallel/DistributedArrays.jl#constructing-distributed-arrays. How did you construct the arrays on the workers?


#3

Minimal example of what I am trying, given my current understanding of the README:

addprocs()
@everywhere begin
  N = 10
  srand(myid())
  r = rand(N)
end
using DistributedArrays
dfill(r, ((N for i in 1:nprocs())...,))

But dfill doesn’t grab the local part from each process. And what are you supposed to do if the array doesn’t have the same name on every process?


#4

You should try to avoid defining variables on the workers with @everywhere. The best approach is to define the local parts with a closure, i.e. something like

DArray((N*nworkers(),)) do I
    srand(myid())
    rand(N) # or rand(length(I[1])), sizing the local part from its index range
end
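As a concrete sketch of this approach (in current Julia syntax, where Random.seed! replaces the older srand; the variable names are illustrative): the closure receives a tuple of index ranges I describing the block this worker owns, and it captures N from the master, which is exactly why no @everywhere definition is needed.

```julia
using Distributed
addprocs(4)
@everywhere using DistributedArrays, Random

N = 10  # captured by the closure and shipped to each worker automatically
d = DArray((N * nworkers(),)) do I
    # I is a tuple of index ranges, e.g. (1:10,) on the first worker;
    # each worker builds only its own block, never the full array
    Random.seed!(myid())
    rand(length(I[1]))
end

size(d)  # (40,)
```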

Alternatively, you can create the local parts and keep a reference to them with a Future and then construct a DArray like

rAs = [@spawnat p rand(N) for p in workers()]
4-element Array{Future,1}:
 Future(2, 1, 382, #NULL)
 Future(3, 1, 383, #NULL)
 Future(4, 1, 384, #NULL)
 Future(5, 1, 385, #NULL)

DArray(rAs)
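Put together, the Future-based pattern looks like this (again in current syntax; the per-worker seeding mirrors the original example, and only lightweight Future handles ever travel back to the master):

```julia
using Distributed
addprocs(4)
@everywhere using DistributedArrays, Random

N = 10
# each worker generates and keeps its own chunk; the master holds only Futures
rAs = [@spawnat p (Random.seed!(myid()); rand(N)) for p in workers()]
d = DArray(rAs)  # stitch the remote chunks into one distributed array
size(d)          # (40,)
```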

#5

Thanks. Those are definitely better ways of handling it.

Is the DArray constructor able to handle the same stuff as pmap? For example, does it need to use a CachingPool for large anonymous functions? Is it essentially a batch_size = N/nprocs() kind of pmap?


#6

Is it essentially a batch_size = N/nprocs() kind of pmap?

Yes, I think that is the right way to think of it. The closure is executed once on each of the workers specified (all of them if none are specified), so in that sense the batch size is N/nworkers(). Note that nworkers() == nprocs() - 1 once workers have been added; with no workers, nworkers() == nprocs() == 1.
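That relationship between nprocs() and nworkers() can be checked directly in a fresh session (a tiny sketch):

```julia
using Distributed

# fresh session: the master doubles as the lone worker
@assert nprocs() == 1 && nworkers() == 1

addprocs(2)

# with workers present, the master is no longer counted as a worker
@assert nprocs() == 3 && nworkers() == 2
```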