@everywhere copy absolutely everything in main process

I have a long script that has a lot of functions defined and packages included. Then there is a big for loop at the end I want to parallelise. For a minimal example:

using Distributed
Distributed.addprocs(4)
@everywhere using Distributions
@everywhere pdfaddone(x)  = pdf(Normal(), x + 1)
@everywhere rootaddtwo(x) = sqrt(x + 2)
@everywhere const arr     = collect(1:10)
result = @distributed (vcat) for x in arr
    pdfaddone(rootaddtwo(x))
end
Distributed.rmprocs(workers()) # rmprocs(4) would only remove the worker with id 4

Now I need to put the @everywhere macro everywhere to send it all to the worker processes. I am not a fan of this because it makes the script harder to read by adding noise. It also means the logic for distributing the calculation is mixed with the calculation logic; ideally I would like to encapsulate the parallel logic somewhere else.
Is there a way to just add processes that already have absolutely everything as it exists in the main process? (For this application I don't really care if this is wasteful in terms of copying stuff not needed in the parallel calculation; I could clean it up later if it becomes a problem.) So basically, do something like:

using Distributions
pdfaddone(x)  = pdf(Normal(), x + 1)
rootaddtwo(x) = sqrt(x + 2)
const arr     = collect(1:10)

using Distributed
Distributed.addprocs(4)
@everywhere everything_in_main_process
result = @distributed (vcat) for x in arr
    pdfaddone(rootaddtwo(x))
end
Distributed.rmprocs(workers())

I don’t think something like that exists. What you could do is use a combination of Revise.jl and include().

So create your main code file, say main.jl, that has all your code. Then in the REPL (or a new file), you can do something like:

using Revise # loads Revise on the main process
using Distributed
addprocs(4)
@everywhere using Revise        # includet is only defined where Revise is loaded
@everywhere includet("main.jl") # or `using ModuleA` if wrapping your code in a module

Now you can go back to main.jl and any changes you make will be reflected across all workers. Note that there are some limitations to what Revise can do.
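Concretely, the workflow with the original example might look like the sketch below. To keep it self-contained it writes main.jl from the script; in practice main.jl is your existing code file. (Assumes Revise and Distributions are installed.)

```julia
using Distributed
addprocs(2)

# Stand-in for your existing application file; normally main.jl already exists.
write("main.jl", """
using Distributions
pdfaddone(x)  = pdf(Normal(), x + 1)
rootaddtwo(x) = sqrt(x + 2)
const arr     = collect(1:10)
""")

@everywhere using Revise          # Revise must be loaded on each worker for includet
@everywhere includet("main.jl")   # every process now loads and tracks main.jl

result = @distributed (vcat) for x in arr
    pdfaddone(rootaddtwo(x))
end
```

Edits to main.jl are then picked up on every process the next time Revise revises.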


How about a block?

@everywhere begin
    using A
    using B
    # …
end

This also works without indentation, so basically it just means adding one line at the top and one at the bottom of the part of your program that should run on all processes.
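Applied to the minimal example from the question, the whole preamble goes into one block (a sketch using the function names from the original snippet):

```julia
using Distributed
addprocs(4)

@everywhere begin
    using Distributions
    pdfaddone(x)  = pdf(Normal(), x + 1)
    rootaddtwo(x) = sqrt(x + 2)
    const arr     = collect(1:10)
end

result = @distributed (vcat) for x in arr
    pdfaddone(rootaddtwo(x))
end
rmprocs(workers())
```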


Thanks. These are both good suggestions.

It occurred to me that with Julia metaprogramming it might be possible for a macro to detect what needs to be sent to the other processes and then include it. You could go through the function bodies of whatever functions are called at the top level in the parallelised loop, detect which modules or other functions they call, record the modules, look into the bodies of the called functions, and so on recursively. When you get to the end, you assemble a big @everywhere statement to send everything that is needed. I haven't used the metaprogramming stuff much, though, so maybe this is not possible.
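As a proof of concept for the detection step, lowered code does expose the global names a function refers to. This is only a sketch: `called_globals` and `pdfplusone` are illustrative names, and a real implementation would have to recurse into callees, resolve modules, and handle dispatch across methods.

```julia
pdfplusone(x) = sqrt(x) + 1  # toy function so the example has no dependencies

# Collect the global names (functions, types, modules) referenced by
# the lowered code of `f` for the given argument types.
function called_globals(f, argtypes)
    ci = code_lowered(f, argtypes)[1]
    names = Symbol[]
    walk(x) = x isa GlobalRef ? push!(names, x.name) :
              x isa Expr      ? foreach(walk, x.args) : nothing
    foreach(walk, ci.code)
    unique(names)
end

called_globals(pdfplusone, (Float64,))  # e.g. [:sqrt, :+]
```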

If you put all of your functions and constants into a package, and then load the package (using MyPackage) after loading Distributed, then the functions will be defined on every worker automatically. Then you just need to use @everywhere, @spawnat, or remotecall to send data back and forth between workers. I believe Revise also works well if loaded after Distributed and before your package.
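A minimal sketch of that pattern, using Distributions as a stand-in for your own package: because the package is loaded after Distributed, its code is available on every worker, and its functions serialize by module path rather than by a binding in Main.

```julia
using Distributed
addprocs(2)
using Distributions   # loaded after Distributed: available on every worker

# No @everywhere needed: `pdf` and `Normal` live in the Distributions module,
# which each worker has loaded, so the call resolves remotely.
y = remotecall_fetch(pdf, 2, Normal(), 0.0)
```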
