Adding data to worker processes via @everywhere

Hi all,

I know next to nothing about parallel processing, so apologies if this question is obvious. I start julia with julia -p 2. Here is a broken simplified version of what my code does:

d = read(some_filepath) # d is really big and takes up most of my RAM
some_vector = read(some_other_filepath)
@everywhere n = 1
while n <= N 
    @everywhere current_d_small = some_function_available_everywhere(d, n)
    @everywhere anonymous_function = (x -> another_function_available_everywhere(x, current_d_small))
    y = pmap(anonymous_function, some_vector)
    n += 1
end

This code obviously doesn’t work, because of the line:

 @everywhere current_d_small = some_function_available_everywhere(d, n)

This line errors because d is not available everywhere. But I can’t make d available everywhere, since a single copy of it takes up most of my RAM.

For the actual work done in pmap I only need current_d_small to be available to all workers, which is much smaller than d. I’ve read the docs, but can’t seem to work out how, on each outer iteration, to get a copy of current_d_small to each of the workers so I can make the pmap call.

Any help would be much appreciated. Also, bear in mind, I know very little about parallel processing, so maybe I should be using something completely different to pmap here.

Thanks,

Colin

If all workers are located on the same machine, you might want to consider threading instead of the functionality in Distributed. Threading uses shared memory and is in general more efficient due to less overhead, but it is also slightly trickier since you must ensure that all operations are thread safe. For example, IO is not currently thread safe. See the @threads macro to parallelize a for-loop. Also see the manual on threading.

Your example code does not make sense since you are not using y anywhere, but using threads might look something like

Threads.@threads for n = 1:N
    current_d_small = some_function_available_everywhere(d, n)
    anonymous_function = (x -> another_function_available_everywhere(x, current_d_small))
    y = map(anonymous_function, some_vector) # What to do with y?
end

There is also a threaded map available in ThreadedMap.jl and KissThreading.jl.
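If you'd rather not add a dependency, a hand-rolled threaded map is only a few lines. This is just a sketch (tmap is a made-up helper here, not the API of either package), reusing the names from your example:

function tmap(f, xs)                        # made-up helper, not ThreadedMap.jl's or KissThreading.jl's API
    out = Vector{Any}(undef, length(xs))    # eltype kept as Any for simplicity
    Threads.@threads for i in 1:length(xs)
        out[i] = f(xs[i])                   # each thread writes only its own slots, so no race on out
    end
    return out
end

y = tmap(anonymous_function, some_vector)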

You could also check out SharedArrays. Your functions some_function_available_everywhere as well as another_function_available_everywhere should be able to work on chunks of data, so there will be some manual work involved…
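For instance, something along these lines might work, assuming current_d_small is (or can be stored as) a numeric array whose size and element type are known up front (the length and eltype below are placeholders):

using Distributed, SharedArrays

# A SharedArray lives in shared memory, so workers on the same machine read it
# without each holding their own copy. Only works for bits types and local workers.
current_d_small = SharedArray{Float64}(1000)   # placeholder length and eltype

for n in 1:N
    current_d_small .= some_function_available_everywhere(d, n)   # fill on the master, where d lives
    y = pmap(x -> another_function_available_everywhere(x, current_d_small), some_vector)
end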

If the above approaches don’t work for you, I’d also recommend checking out ParallelDataTransfer.jl.
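For instance, sendto from ParallelDataTransfer.jl can push just the small object to the workers on every outer iteration. A rough sketch, reusing the names from the original post and assuming the two functions are already defined everywhere (the helper name g is made up):

using Distributed, ParallelDataTransfer

# The worker-side helper looks up the global current_d_small that sendto defines on each worker.
@everywhere g(x) = another_function_available_everywhere(x, current_d_small)

for n in 1:N
    current_d_small = some_function_available_everywhere(d, n)   # computed on the master, where d lives
    sendto(workers(), current_d_small = current_d_small)         # copy only the small object to each worker
    y = pmap(g, some_vector)
end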

Try @everywhere current_d_small = some_function_available_everywhere($d, n)

Thanks for responding. I’ll look into threading. I’d deliberately avoided it up to this point because it was still considered experimental. It sounds like this is only the case for IO now, which doesn’t apply to my current situation.

Regarding y, I store it in a pre-allocated array defined outside the outer loop. I omitted that part for simplicity, but perhaps it affects the ability to apply @threads?

Thanks again.

I was already starting to wonder if I should learn about these. Thanks for confirming that I need to :slight_smile:

Oh wow that seems to do exactly what I was asking for! Many thanks. Chris’s contributions pop up everywhere!

Thanks for responding. I’m afraid this doesn’t work since it just results in Julia checking for an interpolation of d on the other workers (and it isn’t there).

Unfortunately the loop I want to do in parallel includes a lot of random number generation at different points. The docs suggest a way around this when using @threads, but it doesn’t look that fun for a newbie to working in parallel. I might try one of the other solutions. Thanks again.
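For reference, the pattern the docs describe boils down to giving each thread its own RNG and indexing it by threadid(). A minimal sketch (the seeding scheme is only illustrative):

using Random

# one independent RNG per thread, seeded differently so the streams don't collide
rngs = [MersenneTwister(1234 + i) for i in 1:Threads.nthreads()]

Threads.@threads for n = 1:N
    rng = rngs[Threads.threadid()]   # each thread uses only its own RNG
    r = rand(rng)                    # all random draws in the loop body go through rng
    # ... rest of the loop body ...
end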

Unless you access the same part of the array simultaneously from different threads, there should be no problems with this pattern.
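In other words, if the preallocated output has one slot per value of n, each iteration writes only its own slot and the writes cannot conflict. Roughly, with the names from the original example:

Y = Vector{Any}(undef, N)   # preallocated output, one slot per outer iteration

Threads.@threads for n = 1:N
    current_d_small = some_function_available_everywhere(d, n)
    # each thread writes Y[n] only for its own values of n, so there is no race on Y
    Y[n] = map(x -> another_function_available_everywhere(x, current_d_small), some_vector)
end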

ParallelDataTransfer.jl is great, but beware that it does copy the sent variable into the memory of each worker, which could be a problem if I understand the OP correctly.

No, I think this will work fine, as long as I can send current_d_small without having to also send d. I only need current_d_small for the parallel routine and it is much smaller than d. Am looking into all the various options now :slight_smile: