pmap with in-place functions

I was hoping to use pmap with an in place function, but I’m finding the behavior is not as expected. This example illustrates the issue:

@everywhere function f!(x)
    @. x += randn()
end

X = [zeros(2) for i = 1:10];
pmap(f!, X);

When I inspect X after running, I find that it still contains all zeros, which is not what I expected. If I don’t add any processes and only have worker 1, it works as I had hoped.
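The reason is that each element of X is serialized and sent to a worker process, which mutates its own copy; the master's array is untouched unless the result is sent back. A minimal sketch of the same effect using remotecall_fetch directly:

```julia
using Distributed
addprocs(2)

x = zeros(2)
# worker 2 receives a serialized copy of x; mutating that copy
# has no effect on the master's x
remotecall_fetch(v -> (v .+= 1; nothing), 2, x)
x  # still all zeros on the master process
```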


The way I overcame this is to have an intermediate function that first constructs the arrays and then calls the in-place function, and that is what I pass to pmap. It really isn’t in place at all, but it works fine for my use case.
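A minimal sketch of that workaround, with hypothetical names: the wrapper allocates on the worker, mutates in place there, and returns the filled array to the master.

```julia
using Distributed
addprocs(4)

@everywhere function f!(x)
    @. x += randn()
end

# hypothetical wrapper: allocate on the worker, fill in place, return the result
@everywhere function make_and_fill(n)
    x = zeros(n)
    f!(x)
    return x
end

X = pmap(make_and_fill, fill(2, 10))  # 10 filled length-2 vectors
```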

That sort of defeats the purpose of the inplace operation. The whole reason I have the inplace function to begin with is to avoid more allocations.

I’m in a very similar situation. For the sake of simplicity, I’d like to not make my code into a module, and I want one function in the file to call the other in a parpool/pmap fashion.

I thought an easy way to do this would be to simply include the file with @everywhere, but this has proved to be tricky. My current attempt is failing.

I have a file named pmap_test.jl :

# pmap_test.jl

function do_the_work(x)
    x * 2
end

function do_the_pmap()
    num_workers = 8
    pids = addprocs(num_workers)  # add worker processes
    wpool = WorkerPool(pids)      # make worker pool

    # make all functions in this file available to all workers
    @everywhere the_source_file = realpath(Base.source_path())
    @everywhere include(the_source_file)

    # use pmap with wpool
    result = pmap(wpool, do_the_work, collect(1:num_workers*3))

    # attempt to remove worker procs associated with wpool
    rmprocs(pids)

    return result
end
When I try to include this file and call do_the_pmap from a notebook, I get the following:

On worker 2:
SystemError: realpath: No error
#systemerror#44 at .\error.jl:64 [inlined]
systemerror at .\error.jl:64
realpath at .\path.jl:287
eval at .\boot.jl:235
eval_ew_expr at .\distributed\macros.jl:116
#106 at .\distributed\process_messages.jl:268 [inlined]
run_work_thunk at .\distributed\process_messages.jl:56
macro expansion at .\distributed\process_messages.jl:268 [inlined]
#105 at .\event.jl:73
#remotecall_fetch#141(::Array{Any,1}, ::Function, ::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at .\distributed\remotecall.jl:354
remotecall_fetch(::Function, ::Base.Distributed.Worker, ::Expr, ::Vararg{Expr,N} where N) at .\distributed\remotecall.jl:346
#remotecall_fetch#144(::Array{Any,1}, ::Function, ::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at .\distributed\remotecall.jl:367
remotecall_fetch(::Function, ::Int64, ::Expr, ::Vararg{Expr,N} where N) at .\distributed\remotecall.jl:367
(::##25#29)() at .\distributed\macros.jl:102

...and 40 more exception(s).

 [1] sync_end() at .\task.jl:287
 [2] macro expansion at .\distributed\macros.jl:112 [inlined]
 [3] do_the_pmap() at pmap_test.jl:13

Is there a straightforward way to include a simple .jl file with functions and make those functions usable on all workers?

Is there a straightforward way to include a simple .jl file with functions and make those functions usable on all workers?

@everywhere include("simple_jl_file_with_functions.jl")
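One detail worth noting: @everywhere only reaches workers that already exist, so the addprocs call has to come before the include. A sketch with a hypothetical filename:

```julia
using Distributed
addprocs(4)  # workers must exist before @everywhere runs

# hypothetical file defining do_the_work
@everywhere include("simple_jl_file_with_functions.jl")

pmap(do_the_work, 1:12)
```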

If you’re desperate to avoid allocations and want to do in-place operations, I’d try threading instead of pmap. But I wouldn’t expect that to be straightforward either. You don’t want separate threads writing into regions of memory that are close to each other.
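A sketch of the threaded alternative: threads share the master's memory, so in-place mutation really is visible after the loop (start Julia with multiple threads, e.g. via JULIA_NUM_THREADS):

```julia
using Base.Threads

X = [zeros(2) for i = 1:10]

# each iteration mutates a distinct element, so there is no data race
@threads for i = 1:10
    X[i] .+= randn()
end
```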

Won’t threading limit the algorithm to a shared-memory environment?

Also, I’m really surprised that a multiprocessing/multithreading application of an in-place function to appropriately defined data structures is not straightforward.


I think that you should use SharedArrays when writing to an array from multiple processes.

I’d also look into DistributedArrays.

I actually just want to write separate files to disk in each process. It’s “embarrassingly parallel.”
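For that use case the results never need to travel back to the master at all; each worker can write its own file. A sketch with hypothetical output paths:

```julia
using Distributed
addprocs(4)

pmap(1:10) do i
    x = randn(2)                       # allocate and fill on the worker
    open("result_$i.txt", "w") do io   # hypothetical output path
        println(io, x)
    end
    nothing                            # nothing returned to the master
end
```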

I’ve heard this suggestion of SharedArrays before, but no one seems to have an example of using them together with pmap or some other parallel mechanism for such an in place manipulation.

The second link has an example of this and explains why your example does not work (each process has a separate copy of the array).
The idea is to replace the Array you are writing to from multiple processes with a SharedArray.
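A minimal sketch with a SharedArray, in modern syntax (on the 0.6-era Julia shown in the stack trace above, SharedArray lived in Base and @distributed was spelled @parallel):

```julia
using Distributed
addprocs(4)
@everywhere using SharedArrays

# backed by shared memory, so worker writes are visible to the master
X = SharedArray{Float64}(10, 2)

@sync @distributed for i = 1:10
    X[i, :] .= randn(2)  # each worker fills its own rows in place
end
```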