Copying variable to all remote processes?

Hello all,

I’m working on some code where each remote worker needs to compute an array by accumulating values, and the arrays then need to be summed. Below is a minimal working example demonstrating the problem I’m encountering.

Obviously, the code below will not work because I did not use @everywhere when defining q, yet the two @everywhere lines reference it on the workers. In my program, though, q is the result of many functions and other variables, so I’m hoping there is a way to copy q to all workers without needing to put @everywhere in front of every one of q's dependencies.

Any guidance or suggestions would be much appreciated. Thanks!

using Distributed
addprocs(2)

q = reshape(rand(9), 3, 3)
@everywhere nrow = eval(size(q, 1))
@everywhere ncol = eval(size(q, 2))
for i in workers()
    @spawnat i eval(:(a = fill(0, nrow, ncol)))
end

In fact, I believe using @everywhere on all preceding variables will not work in my program anyway, because workers are created after some of those variables have already been defined.

After some random trial and error, this appears to work, but I’m not exactly sure why. :upside_down_face:

using Distributed
addprocs(2)
q = reshape(rand(9), 3, 3)

@everywhere nrows = @eval size($q, 1)
@everywhere ncols = @eval size($q, 2)
for i in workers()
    @spawnat i eval(:(a = fill(0, nrows, ncols)))
end

You can avoid the @eval and just interpolate the sizes into @everywhere:

@everywhere nrows = $(size(q, 1))
@everywhere ncols = $(size(q, 2))
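Putting the pieces together, here is a minimal self-contained sketch of the interpolation approach, including the accumulate-then-sum step from the original question (the accumulator is just initialized to zeros here for illustration):

```julia
using Distributed
addprocs(2)

q = reshape(rand(9), 3, 3)   # defined only on the master process

# Interpolate the *values* of the sizes into the expression sent to
# every process, so the workers never need access to q itself.
@everywhere nrows = $(size(q, 1))
@everywhere ncols = $(size(q, 2))

# Each process allocates its own accumulator array in its Main module.
@everywhere a = fill(0.0, nrows, ncols)

# Fetch each worker's copy of `a` and sum them on the master. Using
# `Main.a` inside the closure reads the global on the *worker*; a bare
# `a` would instead capture and ship the master's copy.
total = sum(remotecall_fetch(() -> Main.a, i) for i in workers())
```

Since every worker's accumulator starts as zeros, `total` here is a 3×3 matrix of zeros; in real code each worker would add its contributions into `a` before the final fetch-and-sum.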

The solutions posted do work in the MWE, but for some reason they don’t work in my Julia package. Could it be something to do with function dependencies? Here’s the code that is encountering the problem. This is all inside a function, by the way, and parse_cfg and read_ascii are both functions in my package. FWIW, all of the code is available on GitHub here.

cfg = parse_cfg(path)
calc_flow_potential = cfg["calc_flow_potential"] == "true"
parallelize = cfg["parallelize"] == "true"
n_workers = parse(Int64, cfg["max_parallel"])
sources_raw = float(read_ascii("$(cfg["source_file"])"))

if parallelize
    println("Starting up Omniscape to use $(n_workers) processes in parallel")
    myaddprocs(n_workers)

    @everywhere nrows_remote = $(size(sources_raw, 1))
    @everywhere ncols_remote = $(size(sources_raw, 2))

    for i in workers()
        @spawnat i eval(:(cum_currmap = fill(0.,
                                             nrows_remote,
                                             ncols_remote)))
    end

    if calc_flow_potential
        for i in workers()
            @spawnat i eval(:(fp_cum_currmap = fill(0.,
                                                    nrows_remote,
                                                    ncols_remote)))
        end
    end
end

myaddprocs(n) is simply addprocs(n) then @everywhere Core.eval(Main, :(import Omniscape))

EDIT

I did get everything sorted by using remotecall_fetch() instead, FWIW!
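For reference, here is a minimal sketch of what the remotecall_fetch route can look like. The names are illustrative placeholders, not Omniscape’s actual code; `sources_raw` stands in for the array read by read_ascii:

```julia
using Distributed
addprocs(2)

sources_raw = rand(4, 5)          # placeholder for read_ascii output
nrows, ncols = size(sources_raw)

# remotecall_fetch runs the do-block on each worker. The captured
# locals (nrows, ncols) are serialized to the worker as arguments, and
# Core.eval defines the global in the worker's Main module. Because no
# macro-level interpolation is involved, this works inside a function.
for i in workers()
    remotecall_fetch(i, nrows, ncols) do nr, nc
        Core.eval(Main, :(cum_currmap = fill(0.0, $nr, $nc)))
        nothing
    end
end

# Verify the arrays now exist on the workers with the right shape.
for i in workers()
    @assert remotecall_fetch(() -> size(Main.cum_currmap), i) == (nrows, ncols)
end
```

The key difference from the @everywhere version is that all data movement happens through ordinary function arguments at runtime, so nothing depends on when the enclosing function was compiled or on which module it lives in.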


I found a rather hacky workaround using pmap to effectively copy the variables to the remote workers (clearly not an ideal solution):

# Runs on whichever worker pmap assigns the task to, and defines the
# globals in that worker's Main module.
function copyvars(i, array)
    global nrows_remote = size(array, 1)
    global ncols_remote = size(array, 2)
end

# 50 tasks so that, with high probability, every worker runs at least one.
pmap(x -> copyvars(x, sources_raw), 1:50)

Any reason why this works while the script in the post just above doesn’t? Thanks again for any insight!