Import a module within that same module for parallelisation

s-baumann · August 17, 2019, 10:21pm

I have a bunch of functions within a module. I want to write a function that runs several other functions from that module in parallel. The approach that works in the REPL is to call addprocs with the number of cores I have and then run @everywhere using ModuleName. Inside ModuleName, however, that does not work: it fails with the error syntax: "toplevel" expression not at top level. A minimal working example is:

module ModuleName
    using Distributed
    function do_something(x)
        return x^2
    end
    function do_manythings(Y::Array, num_cores::Integer)
        pids = addprocs(num_cores)  # keep the ids of the workers that were added
        @everywhere using ModuleName  # this is the line that triggers the syntax error
        squares = @distributed (vcat) for y in Y
            do_something(y)
        end
        rmprocs(pids)  # rmprocs takes worker ids, not a count
        return squares
    end
    export do_something, do_manythings
end

Potentially the issue is that the compiler wants to see ModuleName while compiling this code but doesn’t have it yet (but I don’t really know how compilers work…).

What is the proper way to use the functions of a module within that same module? Ideally, a user of ModuleName would specify how many cores to use and would not have to know or do anything else to get it to run in parallel.

marius311 · August 17, 2019, 10:38pm

You can use @everywhere @eval Mod1 using Mod2 to evaluate using Mod2 in Mod1’s namespace on the remote worker. Note @eval ModuleName expr can generally be used to evaluate anything into another module.
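As a rough single-process sketch of the @eval ModuleName expr form (Mod1 here is an empty placeholder module and Statistics stands in for Mod2; neither choice comes from the question):

```julia
# Mod1 is a placeholder module for illustration only.
module Mod1 end

# Run `using Statistics` inside Mod1 rather than in the current module.
@eval Mod1 using Statistics

# Any other expression can be evaluated into Mod1 the same way.
@eval Mod1 answer = 42

# `mean` is now visible from inside Mod1.
mod1_mean = @eval Mod1 mean([1, 2, 3])

println(Mod1.answer)  # prints 42
println(mod1_mean)    # prints 2.0
```

With Distributed loaded, prefixing the same kind of line with @everywhere should evaluate it on every process instead of only the local one.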

s-baumann · August 18, 2019, 2:42pm

This seems to work. I replaced @everywhere using ModuleName with @everywhere @eval using ModuleName. Just to check my understanding of what this is doing, though:

Is the @eval macro here telling the compiler to skip the using command at compile time and run it at runtime instead?
Will each core compile ModuleName itself, or will each just use the binaries already compiled on the main Julia process?
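For reference, a sketch of the module with that one-line change applied. It assumes ModuleName is a package that the worker processes can find on their load path; the try/finally cleanup is an addition for illustration and was not part of the original example:

```julia
module ModuleName
    using Distributed

    function do_something(x)
        return x^2
    end

    function do_manythings(Y::Array, num_cores::Integer)
        # addprocs returns the ids of the workers it started.
        pids = addprocs(num_cores)
        try
            # `@eval` runs `using` at run time, at top level on each process.
            # Assumption: the workers can locate the ModuleName package.
            @everywhere @eval using ModuleName
            return @distributed (vcat) for y in Y
                do_something(y)
            end
        finally
            # Remove exactly the workers added above, even if an error occurred.
            rmprocs(pids)
        end
    end

    export do_something, do_manythings
end
```

A caller would then only need something like do_manythings(collect(1:10), 4) after loading the package.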

marius311 · August 18, 2019, 7:05pm

Not sure I understand exactly what you mean, but basically it’s the same reason you can’t just put using Foo inside the body of a function. You can, however, write @eval using Foo inside a function to execute that statement in the top-level scope of the current module.
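A minimal sketch of that pattern, with the Statistics standard library standing in for Foo and a made-up helper name, mean_of:

```julia
# `mean_of` is a hypothetical helper, not something from the thread.
function mean_of(xs)
    # Writing `using Statistics` directly here would be a syntax error,
    # because `using` is only allowed at top level.
    @eval using Statistics

    # Evaluate the call at top level as well, so that it sees the
    # methods that were loaded a moment ago.
    return @eval mean($xs)
end

result = mean_of([1, 2, 3])
println(result)  # prints 2.0
```

The second @eval is a cautious choice: calling mean directly in the same function body can run into world-age errors when the package was only just loaded.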

As I just learned while trying to answer this, it looks like @everywhere is smart: it looks through your expression, grabs any import or using statements, and runs those first on the master process before sending the statement to the workers. What happens when it runs using Foo on the master is special, and described here, especially the very last sentence. Basically, Foo is precompiled (if necessary) on the master process, loaded on the master, then loaded on the workers (which use the compile cache). In your case you then also execute using Foo on the workers, which just brings Foo into scope and is instantaneous because Foo was already loaded.
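A small sketch of that loading order at top level, using the Statistics standard library in place of Foo (an assumption for illustration); the worker count of 2 is arbitrary:

```julia
using Distributed

# Start two local workers and remember their ids.
pids = addprocs(2)

# At top level this is allowed: the package is loaded on the master
# and then on every worker.
@everywhere using Statistics

# Each worker can now call `mean` without any further setup.
results = [remotecall_fetch(() -> mean([1, 2, 3]), p) for p in pids]
println(results)

# Clean up the workers that were added.
rmprocs(pids)
```

The same line inside a function body is what produced the error in the original question, which is why the @eval variant is needed there.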

s-baumann · August 18, 2019, 9:30pm

OK. This makes a lot of sense. Thanks for this.
