Should you use @spawnat/@everywhere within a package? (Distributed.jl)

How should a package use Distributed.jl? The goal is to have a package with some functions, so that you execute MyPackage.run() in your main process and the package does all the 'clever' parallelisation in the background.

The package would be imported something like this:

using Distributed
addprocs(4)
using MyPackage

MyPackage.run()

It turns out you cannot do this.

Take for example:

using Distributed
addprocs(4)

module Good

using Distributed

@everywhere function Hello()
    println("Hello")
    return "Hello"
end

end

@everywhere Hello()

This code works fine, but if we extract the module "Good" into a package, we face the following problem:

Package source:
DistributingTestPackage.jl

module DistributingTestPackage

using Distributed

@everywhere function Hello()
    println("Hello")
    return "Hello"
end

@everywhere println("Hello World!")

end

Main code:

using Distributed
addprocs(4)
using DistributingTestPackage

@everywhere Hello()

Running this code gives the following error:

On worker 2:
UndefVarError: `Hello` not defined

In other words, extracting a module into a package changes the way it interacts with Distributed.jl.

Of course the problems above can be easily worked around by doing something like

using DistributingTestPackage
@everywhere using DistributingTestPackage

and then calling the functions in some other way.

However, requiring users of a package to call @everywhere using … every time they use the package does not feel like 'best practice'. So what is the proper way to implement multiprocessing in a package? Does anybody know of a package, e.g. on JuliaHub, that properly implements multiprocessing and that I could look at? (Possibly even in a simulation context.)

Some context: I have a package that runs a Monte Carlo simulation which needs to run on N cores; afterwards the Monte Carlo results need to be averaged, etc. Due to the complexity, it would be nice to extract the Monte Carlo code into a package so that it can be reused. The kind of entry point I have in mind is sketched below.
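For concreteness (simulate_once and run_mc are placeholder names, not actual code from my package):

using Distributed, Statistics

# Placeholder for one Monte Carlo realisation.
simulate_once(params) = rand()

# Desired usage: called once on the main process; the work is
# distributed over the available workers and the results averaged.
# This is exactly the pattern that needs the code loaded on every worker.
run_mc(params, n) = mean(pmap(_ -> simulate_once(params), 1:n))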


The problem is that you’re mixing up “precompile-time” statements with “runtime” statements - anything at the top level of a module is evaluated during precompilation, and generally should not have side effects beyond defining methods and variables within the module itself. Your @everywhere violates this rule by trying to define a function on other workers during a single worker’s precompilation, which isn’t really a valid thing to do. Similarly, @everywhere println("Hello World!") violates this rule, as it is a side effect that is really intended for runtime, not precompile time.

The solution to this is generally simple and is somewhat automatically handled for you: you create the package DistributingTestPackage and define Hello() as a regular function - no @everywhere. Then, users load it in the REPL (after they’ve loaded Distributed and created their workers), and Distributed will attempt to load your DistributingTestPackage on all of their workers automatically, in the background. Users can then do @everywhere DistributingTestPackage.Hello() to call Hello() on all workers. All of the uses of @everywhere thus happen at runtime, while Hello was defined at precompile time (and thanks to Distributed automatically loading DistributingTestPackage everywhere, Hello becomes available everywhere).
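Concretely, the fixed version of the example looks something like this (a sketch following the advice above):

Package source:

module DistributingTestPackage

function Hello()
    println("Hello")
    return "Hello"
end

end

Main code:

using Distributed
addprocs(4)
using DistributingTestPackage  # Distributed also loads it on the workers

@everywhere DistributingTestPackage.Hello()  # runtime call on every process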

You can alternatively define a function HelloAll() = @everywhere Hello() within your package if you don’t want users to need to call @everywhere themselves, which makes sense especially when you start doing more fancy computation and data movement within your simulation.
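For example (a sketch; HelloAll is just an illustrative name):

module DistributingTestPackage

using Distributed

function Hello()
    println("Hello")
    return "Hello"
end

# The @everywhere here runs at runtime, when HelloAll() is called,
# so it does not break precompilation.
HelloAll() = @everywhere DistributingTestPackage.Hello()

end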

What I would recommend for you is to not use Distributed within your package at all - Distributed doesn’t really provide a good set of tools for library authors to use to write distributed libraries; it’s really too simple to make a good library-facing interface for most real libraries.

Instead, I would recommend you take a look at a package like Dagger.jl, which integrates with Distributed workers and gives you a ton of tools to implement automatically distributed (and multithreaded!) libraries. It lets you avoid having to deal with distributed computing, and instead you just define tasks and how they depend on each other - Dagger handles the rest. It also handles data movement, load balancing, latency hiding, scheduling, and much more for you automatically. Check out the documentation for a quick start and introduction.
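To give a flavour, a minimal sketch along the lines of Dagger’s quickstart (see the documentation for the current API):

using Distributed
addprocs(4)
using Dagger

# Spawn independent tasks; Dagger schedules them across workers (and threads).
tasks = [Dagger.@spawn sum(rand(1000)) for _ in 1:100]

# fetch waits for a task and returns its result.
average = sum(fetch.(tasks)) / length(tasks)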

(Note: I am the maintainer of Dagger.jl, so I may be biased, but I still believe the above to be true)
