Sending modules from a local Julia file to remote machines


#1

Hi,

I’m trying to run a Julia program comprising multiple modules on a remote machine. The relevant Julia files are available locally but not on the remote machine.
Adding processes on the remote machines works fine, but including Julia code fails. Specifically, with

@everywhere include("MyModule.jl")

Julia complains that it cannot find the above Julia module file. Presumably, this happens because the file is not available on the remote machine.
How can I make the modules and functions contained in my local Julia files available on the remote machine?

Thanks!


#2

You may try something like this:

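# Julia 1.x syntax; on 0.6 use readstring() and parse() instead
using Distributed
# read file as text
text = read("MyModule.jl", String)
# parse it into an expression (the file must be a single `module ... end` block)
ex = Meta.parse(text)
# evaluate this expression on all workers; `$` interpolates the local value of ex,
# since the variable ex itself is not defined on the workers
@everywhere $ex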

The downside is that if MyModule refers to other files or modules, they won’t be sent. You can handle some such cases by recursively analyzing the content of included files, but that is not very robust.
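A variant of the same idea, as a minimal sketch: instead of parsing the text locally, send the raw source text and let each process evaluate it with `include_string`, which also copes with files containing more than one top-level expression. The helper name `include_everywhere` is made up for this illustration, and the same limitation applies: any `include(...)` inside the file is still resolved on the worker's own filesystem.

```julia
using Distributed

# Hypothetical helper (not part of Distributed): evaluate the contents of a
# local file in Main on each of the given processes.
function include_everywhere(path::AbstractString, procs = workers())
    text = read(path, String)      # read the file on the local machine only
    name = basename(path)          # used as the file name in error messages
    for p in procs
        # include_string evaluates every top-level expression in `text`
        remotecall_wait(include_string, p, Main, text, name)
    end
    return nothing
end
```

After `addprocs(...)`, calling `include_everywhere("MyModule.jl")` should make `MyModule` available in `Main` on every worker; pass `procs()` as the second argument to define it on the master process as well.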

Alternatively, you can make MyModule available for downloading on all workers and then simply run:

Pkg.add("MyModule")

or

Pkg.add(url="git://<public or private repo>/MyModule.jl.git")  # recent Julia 1.x; on 0.6 this was Pkg.clone(url)

This way Julia will handle all the dependencies automatically. One disadvantage of this approach is that packages are installed globally which may be inconvenient on shared workers.


#3

Thanks for the detailed explanations and solutions. For the second solution: if I start several workers on the remote machine, wouldn’t they all download the package simultaneously? If so, would writing in parallel to the remote disk be problematic?


#4

I don’t know if the built-in package manager can handle simultaneous installations or locks packages during installation, but I would simply run Pkg.add() on each worker synchronously, something like this:

for w in workers()
    remotecall_wait(Pkg.add, w, "MyModule")
end

Since you need to install packages only once, this will be slow the first time you do it, but then Pkg will only check the package for existence.

Alternatively, you can remove duplicates from the list of worker hosts and run Pkg.add() only once per unique host.
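The deduplication could look roughly like the sketch below. Both helper names (`one_worker_per_host`, `add_once_per_host`) are made up for this illustration, and it assumes that all workers on one host share the same package directory, so that a single installation per host is enough.

```julia
using Distributed

# Run this after addprocs so that every worker has Pkg loaded.
@everywhere using Pkg

# Hypothetical helper: pick one worker per distinct hostname.
function one_worker_per_host(ws = workers())
    chosen = Dict{String,Int}()
    for w in ws
        host = remotecall_fetch(gethostname, w)
        get!(chosen, host, w)   # keep only the first worker seen on each host
    end
    return sort!(collect(values(chosen)))
end

# Hypothetical helper: install a package once per host, one host at a time.
function add_once_per_host(pkg::AbstractString)
    for w in one_worker_per_host()
        remotecall_wait(Pkg.add, w, pkg)
    end
end
```

With this in place, `add_once_per_host("MyModule")` would replace the loop over all workers.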


#5

# Include a file on a path not available on a remote worker
include_remote(path, 2)

From https://github.com/ChrisRackauckas/ParallelDataTransfer.jl