How am I supposed to use pmap?

I have a large project where one function includes a pmap call, and I’m not sure the right way to get the code in scope on all the workers.

  1. @everywhere include("code.jl"), where code.jl loads all the packages and includes my functions. Then I run a function from my code. This doesn’t work and gives me a lot of warnings/errors about reloading modules.
  2. Wrapping all my code in a module MyModule, doing include("MyModule.jl"), then using MyModule. This doesn’t work: MyModule isn’t defined on the workers.
  3. @everywhere include("MyModule.jl"), then using MyModule. The first line gives a ton of warnings about replacing modules from packages, which surprised me because I haven’t used MyModule yet. Apart from the warnings, it works fine.

Works fine for me.

julia> @everywhere module dummyModule
           function dummyFunction(i)
               i^2
           end
       end

julia> pmap(dummyModule.dummyFunction, 1:10)
10-element Array{Any,1}:
   1
   4
   9
  16
  25
  36
  49
  64
  81
 100

You may have forgotten to run @everywhere workspace() to clear the modules left over from previous failed attempts. include() just evaluates the code from the file, so it defines the module again every time you call it.

To avoid warnings, first import each package on the master process, then run @everywhere using for that same package.

Alternatively, you can add the path containing your module to LOAD_PATH:

push!(LOAD_PATH,"YourPath")
@everywhere using MyModule

I think there is a way to permanently add a path to the LOAD_PATH instead.

You can add the push!(LOAD_PATH,...) statement to your Julia startup file (i.e. ~/.juliarc.jl on Linux).

I tried adding an import statement for every package inside my module, followed by using for the same packages. It didn’t help. Do you mean the module should use @everywhere using?

I also tried push!(LOAD_PATH, pwd()) because the current directory contains mymodule.jl, but using MyModule doesn’t work (MyModule not found in current path).

It is working fine except for the warnings, so this isn’t really a problem that needs to be solved, but I feel like I’m not approaching this in the proper manner.

Not inside the module, but in the script you run, e.g.:

import Plots
import MyModule
@everywhere using Plots
@everywhere using MyModule
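
For anyone who wants to try this pattern without Plots or MyModule, here is a minimal sketch. It assumes a recent Julia (where Distributed has to be loaded explicitly) and uses the standard-library package LinearAlgebra as a stand-in for the real packages; that substitution and the variable name lengths are mine, not from the posts above.

```julia
using Distributed
addprocs(2)

# Load the package on the master first, then bring it into scope on every process.
# LinearAlgebra is only a stand-in for Plots / MyModule here.
import LinearAlgebra
@everywhere using LinearAlgebra

# The closure refers to norm unqualified, so norm has to be in scope on each worker.
lengths = pmap(x -> norm([3x, 4x]), 1:3)
```

The explicit import on the first line mirrors the advice given in the thread; whether it is still needed to suppress warnings on newer versions is something I have not checked.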

The warnings appear when I load the code for the module.

@everywhere include("mymodule.jl")
WARNING: replacing module ForwardDiff.
WARNING: replacing module ForwardDiff.
WARNING: replacing module ForwardDiff.
WARNING: replacing module ForwardDiff.
WARNING: replacing module BenchmarkTools
WARNING: replacing module BenchmarkTools
WARNING: replacing module BenchmarkTools
WARNING: replacing module BenchmarkTools
......
using MyModule

After that it works. I don’t find @everywhere using to be necessary.

include("mymodule.jl")
@everywhere using MyModule
julia> include("mymodule.jl")

julia> @everywhere using MyModule
ERROR: On worker 2:
ArgumentError: Module MyModule not found in current path.
Run `Pkg.add("MyModule")` to install the MyModule package.

Doesn’t work.

Do you have a minimal example?

I typically use this approach:

  1. Put everything you need distributed to workers into a Module.
  2. Save the Module to the filesystem.
  3. Ensure the path to the Module is in LOAD_PATH.
  4. addprocs()
  5. using Module

Notes:

  • It is important to addprocs before using Module
  • You shouldn’t need @everywhere
  • You shouldn’t get warnings about replacing module.

Here’s an example:

module ModuleA
    export testA
    function testA(x)
        println(x, " -> ", x*x)
        return x*x
    end
end

Save this to the file ModuleA.jl (or ModuleA/src/ModuleA.jl).

Now run this:

addprocs()

# ensure module is in LOAD_PATH
# in this example, ModuleA was saved to the same directory as this script (but it can be any directory)
thisDir = dirname(@__FILE__())
any(path -> path == thisDir, LOAD_PATH) || push!(LOAD_PATH, thisDir)

using ModuleA

pmap(testA, 1:10)

And here’s the output:

Julia-0.5.2> include("loading_pmap.jl")
        From worker 2:  1 -> 1
        From worker 3:  4 -> 16
        From worker 5:  2 -> 4
        From worker 4:  3 -> 9
        From worker 2:  5 -> 25
        From worker 5:  6 -> 36
        From worker 4:  7 -> 49
        From worker 3:  8 -> 64
        From worker 2:  9 -> 81
        From worker 4:  10 -> 100
10-element Array{Any,1}:
   1
   4
   9
  16
  25
  36
  49
  64
  81
 100

This worked well, thanks. I realized what my problem was with the LOAD_PATH approach I tried earlier: the module file needs to have the same name as the module, so ModuleA.jl for ModuleA.
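
To make the file-name point concrete, here is a small self-contained sketch. It assumes a recent Julia with the Distributed standard library, writes the module into a temporary directory (dir is a hypothetical scratch location, not a path from this thread), and pushes that directory onto LOAD_PATH on every process rather than relying on workers picking it up from the master.

```julia
using Distributed

# Hypothetical scratch directory that will hold the module file.
dir = mktempdir()

# The file name has to match the module name: ModuleA lives in ModuleA.jl.
write(joinpath(dir, "ModuleA.jl"), """
module ModuleA
    export testA
    testA(x) = x * x
end
""")

addprocs(2)

# Make the directory visible to the package loader on the master and all workers.
@everywhere push!(LOAD_PATH, $dir)
@everywhere using ModuleA

result = pmap(testA, 1:10)
```

If the file were saved as mymodule.jl instead, the using ModuleA line would be expected to fail with a "not found" style error like the one shown earlier.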

I did get a mysterious error after running my code though.

using PyPlot
WARNING: An error occurred during inference. Type inference is now partially disabled.
Base.MethodError(f=typeof(Core.Inference.convert)(), args=(Base.AssertionError, "invalid age range update"), world=0x0000000000000abf)
... hundreds of lines of errors

This is probably unrelated to the current situation though. Maybe a 0.6 RC1 bug.

Using the example from @greg_plowman works for me too, but if I change the pmap call to:

pmap(t -> testA(t), 1:10)

Then the worker cannot find testA. Any idea how to resolve this?

pmap(t -> ModuleA.testA(t), 1:10)

Thanks. Using @everywhere using ModuleA instead also works.
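
Both fixes can be seen side by side in the sketch below. My understanding (an assumption, not something stated above) is that the anonymous function is evaluated in Main on each worker, so testA must either be qualified or be brought into Main there. To keep the example self-contained, ModuleA is defined inline with @everywhere instead of being loaded from LOAD_PATH, and it assumes a recent Julia with the Distributed standard library.

```julia
using Distributed
addprocs(2)

# Stand-in for the thread's ModuleA, defined directly on every process.
@everywhere module ModuleA
    export testA
    testA(x) = x * x
end

# Option 1: qualify the name inside the closure.
qualified = pmap(t -> ModuleA.testA(t), 1:10)

# Option 2: bring the exports into Main on every process first.
# The leading dot is needed only because ModuleA was defined inline above;
# with a module found through LOAD_PATH it would be `@everywhere using ModuleA`.
@everywhere using .ModuleA
unqualified = pmap(t -> testA(t), 1:10)
```

Option 1 keeps the worker namespaces untouched, while option 2 is more convenient when many exported functions are used inside closures.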