Problems declaring used packages on workers when parallelizing

Hi! I am having a lot of trouble setting up my code for parallelization. When I try to “declare” the packages that I am using on the workers by running the following code:

#--------------------------------#
#         Initialization         #
#--------------------------------#

# Packages
#using Distributed

using QuantEcon, Optim, Distributions, DelimitedFiles, ExcelReaders
using ProgressMeter, BenchmarkTools, DataFrames, Combinatorics
using LinearAlgebra, Statistics, Random, StatsBase
using BlackBoxOptim, Distributed

# Number of cores/workers
addprocs(2)

@everywhere begin
    using Pkg; Pkg.activate(".")  # required
    using QuantEcon, Optim, Distributions, DelimitedFiles, ExcelReaders
    using ProgressMeter, BenchmarkTools, DataFrames, Combinatorics
    using LinearAlgebra, Statistics, Random, StatsBase
    using BlackBoxOptim, Distributed
end

# Set Directory

cd("XXX\\julia_codes")

I get the following error:

On worker 2:

ArgumentError: Package QuantEcon not found in current path:
- Run `import Pkg; Pkg.add("QuantEcon")` to install the QuantEcon package.

require at .\loading.jl:823
top-level scope at XXX\julia_codes\4-Simulations.jl:26
eval at .\boot.jl:328
#116 at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:276
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:56
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:65
#102 at .\task.jl:259
#remotecall_wait#154(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:421
remotecall_wait(::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:412
#remotecall_wait#157(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
remotecall_wait(::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
(::getfield(Distributed, Symbol("##161#163")){Module,Expr})() at .\task.jl:259

...and 7 more exception(s).

in top-level scope at stdlib\v1.1\Distributed\src\macros.jl:183

in remotecall_eval at stdlib\v1.1\Distributed\src\macros.jl:199

in macro expansion at base\task.jl:245

in sync_end at base\task.jl:226

Hi,
The issue here is that your worker processes aren’t running in the same environment as the main process. By default, new workers launch in the default environment. For some reason (on Julia 1.1.1, Linux)

@everywhere Pkg.activate(".")

doesn’t seem to activate the current directory’s project on the workers. I’m not sure if that’s a bug or the intended behaviour.
Edit: The fact that

@everywhere Pkg.activate(".")

doesn’t work is by design. There is an open issue (Workers should inherit Pkg environment · Issue #28781 · JuliaLang/julia · GitHub) discussing why that is the case and how the situation can be improved. Seems like the workaround I describe below (which I came across on another discourse thread: Packages and workers - #9 by teored90) is still the way to go.

In any event, you can get the workers to launch in the environment defined in the current directory’s Project.toml file by passing the exeflags keyword argument to `addprocs`, like so:

addprocs(2, exeflags="--project=.")

With the workers launched like that I’m able to load packages on the workers that are only installed in the current environment.
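
If you'd rather not depend on what "." resolves to, a variation that might help is to pass along whatever project the main process already has active. This is only a minimal sketch: Base.active_project() is not exported, so treat relying on it as an assumption on my part, and the check at the end is just there to show what each worker ended up with.

```julia
using Distributed

# Project file the main process is currently using
# (may be `nothing` if no project is active at all).
project = Base.active_project()

# Start two workers; pass the same project along if there is one.
pids = if project === nothing
    addprocs(2)
else
    addprocs(2; exeflags="--project=$(project)")
end

# Ask each worker which project it ended up with.
for pid in pids
    println("worker ", pid, " => ", remotecall_fetch(Base.active_project, pid))
end
```

If the printed paths match the project on the main process, the workers should resolve packages from the same Project.toml.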


Thanks for the reply. It still does not work for me. I am only replacing my addprocs line with the one you suggested and deleting the Pkg.activate(). Is that correct?

Hmmm. Are you also activating the right environment on the master process? You’ll have to either start Julia with the --project=. flag, e.g. from the shell run

$ julia --project=.

or start julia as normal and run

using Pkg; Pkg.activate(".")

on the master process before loading your packages.


I am doing exactly that and it does not work. I am so puzzled! This is my code:

using Pkg; Pkg.activate(".")

using Distributed


using QuantEcon, Optim, Distributions, DelimitedFiles, ExcelReaders
using ProgressMeter, BenchmarkTools, DataFrames, Combinatorics
using LinearAlgebra, Statistics, Random, StatsBase
using BlackBoxOptim, BitOperations

#Number of cores/workers
addprocs(2, exeflags="--project=.")

#Initializing packages in workers
@everywhere begin
    using QuantEcon, Optim, Distributions, DelimitedFiles, ExcelReaders
    using ProgressMeter, BenchmarkTools, DataFrames, Combinatorics
    using LinearAlgebra, Statistics, Random, StatsBase
    using BlackBoxOptim, BitOperations
end

and this is the error message I get:

On worker 2:

ArgumentError: Package QuantEcon not found in current path:
- Run `import Pkg; Pkg.add("QuantEcon")` to install the QuantEcon package.

require at .\loading.jl:823
top-level scope at C:\XXX\julia_codes\6-Simulations_par.jl:30
eval at .\boot.jl:328
#116 at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:276
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:56
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:65
#102 at .\task.jl:259
#remotecall_wait#154(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:421
remotecall_wait(::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:412
#remotecall_wait#157(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
remotecall_wait(::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
(::getfield(Distributed, Symbol("##161#163")){Module,Expr})() at .\task.jl:259

...and 1 more exception(s).

in top-level scope at stdlib\v1.1\Distributed\src\macros.jl:183

in remotecall_eval at stdlib\v1.1\Distributed\src\macros.jl:199

in macro expansion at base\task.jl:245

in sync_end at base\task.jl:226

Henh. That is very strange. I’m sorry, I’m out of ideas. Not sure what’s happening on your machine.


Thanks a lot anyways!

Does someone have an idea about this issue? @ksmcreynolds

@Pbellive can you please tell me in which order you would recommend running the above code?

Have you tried something like this:

@everywhere begin
    using Pkg; Pkg.activate(path_to_env)
    # ... your using statements ...
end

Edit: Also, in case it is not obvious, Pkg.activate(".") depends on the working directory, so it will not activate the same environment in every case.

To help diagnose your issue, you might compare the output of Base.load_path() on master vs worker:

Base.load_path()
remotecall_fetch(Base.load_path, 2)
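
If the lists are long, it may be easier to print only the entries that differ. The following is a rough sketch of that comparison; path_diff is a hypothetical helper name I made up, and checking DEPOT_PATH alongside the load path is my own addition rather than something confirmed to be the cause here.

```julia
using Distributed

# Make sure there is at least one worker to compare against.
nprocs() == 1 && addprocs(1)

# Hypothetical helper: entries present on one side but not on the other.
function path_diff(master::Vector{String}, worker::Vector{String})
    return (only_master = setdiff(master, worker),
            only_worker = setdiff(worker, master))
end

for pid in workers()
    # Where `using` looks for projects and packages.
    load_diff = path_diff(Base.load_path(),
                          remotecall_fetch(Base.load_path, pid))
    # Where installed packages and environments are stored.
    depot_diff = path_diff(copy(DEPOT_PATH),
                           remotecall_fetch(() -> copy(DEPOT_PATH), pid))
    println("worker ", pid, " load path:  ", load_diff)
    println("worker ", pid, " depot path: ", depot_diff)
end
```

Empty lists on both sides mean the worker sees the same locations as the master; anything listed under only_master or only_worker is a place to look.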

Thanks for the reply:

I am running

using Pkg; Pkg.activate(".")

using Distributed

addprocs(2, exeflags="--project=.")

When I compare the outputs of Base.load_path(), I get:

Base.load_path()

"C:\\Users\\jmcastro\\Project.toml"
"C:\\Users\\jmcastro\\.juliapro\\JuliaPro_v1.1.1.1\\environments\\v1.1\\Project.toml"
"C:\\Users\\jmcastro\\AppData\\Local\\JuliaPro-1.1.1.1\\Julia-1.1.1\\share\\julia\\stdlib\\v1.1"

remotecall_fetch(Base.load_path, 2)

"C:\\Users\\jmcastro\\Project.toml"
"C:\\Users\\jmcastro\\.julia\\environments\\v1.1\\Project.toml"
"C:\\Users\\jmcastro\\AppData\\Local\\JuliaPro-1.1.1.1\\Julia-1.1.1\\share\\julia\\stdlib\\v1.1"

So what differs is the second path in each case.

Finally, when I run

using QuantEcon, Optim, Distributions, DelimitedFiles, ExcelReaders
using ProgressMeter, BenchmarkTools, DataFrames, Combinatorics
using LinearAlgebra, Statistics, Random, StatsBase
using BlackBoxOptim, BitOperations

I get the error

On worker 2:

ArgumentError: Package ExcelReaders [c04bee98-12a5-510c-87df-2a230cb6e075] is required but does not seem to be installed:
- Run `Pkg.instantiate()` to install all recorded dependencies.

_require at .\loading.jl:929
require at .\loading.jl:858
#2 at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\Distributed.jl:77
#116 at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:276
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:56
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:65
#102 at .\task.jl:259
#remotecall_wait#154(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:421
remotecall_wait(::Function, ::Distributed.Worker) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:412
#remotecall_wait#157(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Int64) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
remotecall_wait(::Function, ::Int64) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
(::getfield(Distributed, Symbol("##1#3")){Base.PkgId})() at .\task.jl:259

Is it possible that this issue is related somehow to JuliaPro?

It is quite strange that you get an error from worker 2 when it seems you have simply executed using statements (without @everywhere).
Is that so?

I tend to run all using statements on the main process first (as this triggers precompilation).
Then add ONE worker.
Then run the using statements with @everywhere.

Have you tried that?

Also, did you run Pkg.instantiate() on the main process (after Pkg.activate)?
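
Roughly, the order I have in mind looks like the sketch below. LinearAlgebra and Statistics are only stand-ins for your package list, the isfile guard around Pkg.instantiate() is my own addition so the script also runs in a folder without a Project.toml, and Base.active_project() is not exported, so take all of that as assumptions rather than the one right way to do it.

```julia
using Pkg
Pkg.activate(".")   # assumption: Julia was started in the project folder

# Install the recorded dependencies, if a project file actually exists.
isfile(Base.active_project()) && Pkg.instantiate()

using Distributed

# 1. Load everything on the main process first (this triggers precompilation).
using LinearAlgebra, Statistics   # stand-ins for the real package list

# 2. Add ONE worker, started in the same project as the main process.
pid = first(addprocs(1; exeflags="--project=$(Base.active_project())"))

# 3. Only now load the packages everywhere.
@everywhere using LinearAlgebra, Statistics

# Quick check that the worker can really use a loaded package.
println(remotecall_fetch(() -> Statistics.mean([1, 2, 3]), pid))
```

If step 3 still fails with only one worker, at least the error is easier to read, and you know it is not a precompilation race between several workers.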

Most likely it is related to JuliaPro, especially since load_path() on the workers is different from that on the master process.

See Load modules in several workers - #19 by greg_plowman