How to define a function on all worker processes?

parallel

#1

I’m trying to run the following code on Julia 0.4.5 via Jupyter (so I can’t load files):

function foo()
    return rand()
end

np = nprocs()
ap = 0
if np < CPU_CORES
    ap = addprocs(CPU_CORES - np)
end

@everywhere bar = foo()

for x in ap
    println(remotecall_fetch(x, () -> bar))
end

But I get the following exception:

On worker 9:
UndefVarError: foo not defined
 in eval at ./sysimg.jl:14
 in anonymous at multi.jl:1394
 in anonymous at multi.jl:923
 in run_work_thunk at multi.jl:661
 [inlined code] from multi.jl:923
 in anonymous at task.jl:63
 in remotecall_fetch at multi.jl:747
 in remotecall_fetch at multi.jl:750
 in anonymous at multi.jl:1396

...and 6 other exceptions.


 in sync_end at ./task.jl:413
 in anonymous at multi.jl:1405

According to the documentation:

You can force a command to run on all processes using the @everywhere macro. For example, @everywhere can also be used to directly define a function on all processes:

julia> @everywhere id = myid()

julia> remotecall_fetch(2, ()->id)
2

The docs aren’t very clear on how to actually define a function on all processes, and the example simply shows you how to run a function on all processes and then gets the return value.


#2

@everywhere bar = foo() defines the variable bar on all procs, but only if the function foo() is already defined on every proc, since each proc calls foo() itself to compute bar. The gist: use @everywhere function foo() ... end when defining the function.
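A minimal sketch of that suggestion, assuming a recent Julia (0.7 or later, where `using Distributed` is required and the function is the first argument of remotecall_fetch; on 0.4 the worker id comes first and no `using` is needed). The worker count of 2 and the name workers_added are illustrative only. The workers are started before the definition, because @everywhere only reaches processes that already exist.

```julia
using Distributed  # assumption: Julia 0.7+; on 0.4 these functions live in Base

# Start the workers first so the definition below reaches them too.
workers_added = addprocs(2)

# Define foo on the master and on every worker that is currently running.
@everywhere function foo()
    return rand()
end

# Each process calls foo() itself, so each one gets its own value of bar.
@everywhere bar = foo()

for pid in workers_added
    # Read the global bar that lives on the worker (not the master's copy).
    println(remotecall_fetch(getfield, pid, Main, :bar))
end
```

Reading the value with getfield(Main, :bar) avoids shipping a closure that refers to the master's own bar.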


#3

Thanks. The problem then is that the function is defined before the worker processes are started, and this will always be the case since we start/stop workers based on how much work is to be done. How do I pass an already defined function to a new worker?


#4

Not sure how to do this. Let’s wait for someone more knowledgeable.


#5

I don’t know if this is the best way to do this or not, but here is a working example. Essentially you put all the functions you want your new worker to load into a separate file (test.jl in this example) and then instruct the worker to load that file.

shell> cat test.jl
f() = 1

julia> include("test.jl")
f (generic function with 1 method)

julia> f()
1

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> remotecall_fetch(f, 2)
ERROR: On worker 2:
UndefVarError: #f not defined
 in deserialize_datatype at ./serialize.jl:823
 in handle_deserialize at ./serialize.jl:571
 in deserialize_msg at ./multi.jl:120
 in message_handler_loop at ./multi.jl:1317
 in process_tcp_streams at ./multi.jl:1276
 in #618 at ./event.jl:68
 in #remotecall_fetch#606(::Array{Any,1}, ::Function, ::Function, ::Base.Worker) at ./multi.jl:1070
 in remotecall_fetch(::Function, ::Base.Worker) at ./multi.jl:1062
 in #remotecall_fetch#609(::Array{Any,1}, ::Function, ::Function, ::Int64) at ./multi.jl:1080
 in remotecall_fetch(::Function, ::Int64) at ./multi.jl:1080

julia> remotecall_fetch(include, 2, "test.jl")
f (generic function with 1 method)

julia> remotecall_fetch(f, 2)
1

#6

Unfortunately I’m running inside Jupyter with a read-only filesystem, so I can’t create files.


#7

Oops, I missed that part of your post.

How about something like this:

julia> expr = quote
           f() = 1
       end
quote  # REPL[1], line 2:
    f() = begin  # REPL[1], line 2:
            1
        end
end

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> remotecall_wait(eval, 2, expr)
Future(2,1,3,Nullable{Any}())

julia> fetch(@spawnat 2 f())
1

This feels like bad practice, but maybe it works in a pinch?
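If workers are started and stopped repeatedly, the same idea could be wrapped in a helper so every new worker gets the definitions. This is only a sketch under assumptions: it targets Julia 0.7 or later (`using Distributed`, Core.eval with an explicit module), and WORKER_DEFS and add_workers_with_defs are hypothetical names, not part of any library.

```julia
using Distributed  # assumption: Julia 0.7+

# Keep the definitions as an expression so they can be replayed on new workers.
const WORKER_DEFS = quote
    foo() = rand()
end

# Hypothetical helper: start n workers and evaluate the stored definitions on each.
function add_workers_with_defs(n)
    pids = addprocs(n)
    for pid in pids
        # Evaluate the expression in the worker's Main module and wait for it.
        remotecall_wait(Core.eval, pid, Main, WORKER_DEFS)
    end
    return pids
end

Core.eval(Main, WORKER_DEFS)      # define foo on the master as well
pids = add_workers_with_defs(2)

# foo now exists on the new workers, so it can be called there by name.
println([remotecall_fetch(foo, pid) for pid in pids])
```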


#8

First off, why are you using Julia 0.4.5 instead of 0.5.2 or even 0.6-rc2?

On 0.5 and later you can simply say @eval @everywhere bar = $(foo())
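One thing worth noting about this form: $(foo()) is evaluated once on the calling process and the result is spliced into the expression, so foo itself never has to exist on the workers, and every process ends up with the same value of bar. A small sketch, assuming Julia 0.7 or later (hence `using Distributed`) and an illustrative worker count of 2:

```julia
using Distributed  # assumption: Julia 0.7+

foo() = rand()          # defined on the master only
pids = addprocs(2)      # workers started afterwards do not know foo

# $(foo()) runs on the master; the resulting number is spliced into the
# expression, so every process assigns the same value to bar.
@eval @everywhere bar = $(foo())

for pid in pids
    # Compare the worker's own global bar with the master's value.
    println(remotecall_fetch(getfield, pid, Main, :bar) == bar)
end
```

If each worker needs its own independent result instead, the function has to be defined on the workers first.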


#9

Well, I mention this so often in bug reports that it should be my signature. We have customers that are change-averse, and we cannot expect them to upgrade very often. It took us several months to port all our code from 0.3 to 0.4, and given that 1.0 is imminent, we’ve decided to skip 0.5 and 0.6 and move straight to 1.0, which is expected to have long-term support. If 0.6 is supposed to be forward compatible with 1.0, then we’ll consider moving to that sooner.

Will the above work in 1.0 as well?


#10

Thanks, that sounds promising.


#11

I’m definitely making progress with this method. There are still some complications because I’m using PyCall in the workers to call out to a Python function, but I’ll figure that out. Thanks for your help.


#12

So this worked really well. Specifically, it was the only way for me to call using ModuleName in a worker process. Since my workers were started after I’d called using in the main process, none of the types made it through, so I had to call using separately in each worker.
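For anyone landing here later, that pattern might look roughly like the sketch below. It assumes Julia 0.7 or later and uses the standard library module Statistics as a stand-in for ModuleName; the worker count is illustrative.

```julia
using Distributed  # assumption: Julia 0.7+
using Statistics   # stand-in for ModuleName, loaded before any worker exists

pids = addprocs(2)

# Run `using Statistics` in each new worker's Main module and wait for it.
for pid in pids
    remotecall_wait(Core.eval, pid, Main, :(using Statistics))
end

# The module's exports are now visible on the workers.
for pid in pids
    println(remotecall_fetch(Core.eval, pid, Main, :(mean([1, 2, 3]))))
end
```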