How can I use parallelism (on multiple processes) in a Julia application/package?
I have an application (in the Pkg3 sense, i.e. a project that is not a package) in which I use pmap.
I tried to load the package with using ParallelTest after doing
using Distributed
addprocs(2)
but I get an error saying that the package is not installed.
Steps to reproduce:
(v0.7) pkg> generate ParallelTest
Generating project ParallelTest:
ParallelTest\Project.toml
ParallelTest/src/ParallelTest.jl
shell> cd ParallelTest
C:\Users\memo\Documents\julia\ParallelTest
(v0.7) pkg> activate .
julia> using Distributed
julia> addprocs(2)
2-element Array{Int64,1}:
2
3
julia> using ParallelTest
[ Info: Precompiling ParallelTest [0e712700-ac5e-11e8-3696-ef47f8f04c4e]
ERROR: On worker 2:
ArgumentError: Package ParallelTest [0e712700-ac5e-11e8-3696-ef47f8f04c4e] is required but does not seem to be installed:
- Run `Pkg.instantiate()` to install all recorded dependencies.
_require at .\loading.jl:923
require at .\loading.jl:852
#2 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\Distributed.jl:77
#116 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\process_messages.jl:276
run_work_thunk at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\process_messages.jl:56
run_work_thunk at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\process_messages.jl:65
#102 at .\task.jl:262
#remotecall_wait#154(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\remotecall.jl:407
remotecall_wait(::Function, ::Distributed.Worker) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\remotecall.jl:398
#remotecall_wait#157(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Int64) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\remotecall.jl:419
remotecall_wait(::Function, ::Int64) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\remotecall.jl:419
(::getfield(Distributed, Symbol("##1#3")){Base.PkgId})() at .\task.jl:262
...and 1 more exception(s).
Stacktrace:
[1] sync_end(::Array{Any,1}) at .\task.jl:229
[2] macro expansion at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\Distributed.jl:75 [inlined]
[3] macro expansion at .\task.jl:247 [inlined]
[4] _require_callback(::Base.PkgId) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v0.7\Distributed\src\Distributed.jl:74
[5] #invokelatest#1 at .\essentials.jl:691 [inlined]
[6] invokelatest at .\essentials.jl:690 [inlined]
[7] require(::Base.PkgId) at .\loading.jl:855
[8] macro expansion at .\logging.jl:311 [inlined]
[9] require(::Module, ::Symbol) at .\loading.jl:834
If all you want is to use a method from ParallelTest that calls pmap on your main process, you have to run using ParallelTest before adding processes. The other processes don’t need ParallelTest per se, but they do need the function that is given to pmap:
help?> pmap
Search: pmap promote_shape typemax PermutedDimsArray process_messages
pmap(f, [::AbstractWorkerPool], c...; distributed=true, batch_size=1, on_error=nothing, retry_delays=[], retry_check=nothing) -> collection
Transform collection c by applying f to each element using available workers and tasks.
For multiple collection arguments, apply f elementwise.
Note that f must be made available to all worker processes; see Code Availability and Loading Packages for details.
#<snip>
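To illustrate the last note in the help text, here is a minimal sketch of making a function available on every worker before handing it to pmap. The name slow_square is made up for the example; in the setup from the question it would be whatever function ParallelTest passes to pmap.

```julia
using Distributed

# Start two local worker processes.
addprocs(2)

# Define the function on the main process and on every worker.
# `slow_square` is a hypothetical stand-in for the real work function.
@everywhere slow_square(x) = (sleep(0.1); x^2)

# pmap distributes the calls over the available workers and
# returns the results in input order.
results = pmap(slow_square, 1:4)
```

Without the @everywhere, the function would only exist on the main process and the workers would fail when asked to call it.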
Thanks for the answers!
I am trying to load some files on all workers, but I am not sure how to do it properly.
Considering the example in the first post, I changed the contents of ParallelTest.jl to:
__precompile__(false)
module ParallelTest
using Distributed
@everywhere using Pkg
@everywhere Pkg.activate(".")
@everywhere include("$(@__DIR__)/parallel.jl")
@everywhere using .PMod
greet() = print("Hello World!")
end # module
and added another file, parallel.jl, in the src directory containing:
module PMod
println("loaded")
end # module PMod
I managed to load the module successfully, but I get redefinition warnings for the module loaded in parallel.
(v0.7) pkg> activate .
julia> using Distributed
julia> addprocs(2)
2-element Array{Int64,1}:
2
3
julia> using ParallelTest
[ Info: Precompiling ParallelTest [0e712700-ac5e-11e8-3696-ef47f8f04c4e]
loaded
From worker 3: loaded
From worker 2: loaded
From worker 2: WARNING: replacing module PMod.
WARNING: replacing module PMod.
From worker 2: loadedWARNING: replacing module PMod.
From worker 3: WARNING: replacing module PMod.
loaded
loaded
From worker 2: WARNING: replacing module PMod.
From worker 2: loaded
From worker 3: loaded
From worker 3: WARNING: replacing module PMod.
From worker 3: loaded
I also tried to use using .PMod instead of @everywhere using .PMod, but I get a LoadError saying that PMod is not defined. I think the problem is that on the workers PMod is defined in the global scope, while on the master process it is inside a module.
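A possible explanation for the repeated output, offered as an assumption rather than something confirmed in this thread: with workers already running, using ParallelTest loads the package on every process, and each of those three loads then runs its own @everywhere lines, so parallel.jl is included 3 × 3 = 9 times, which matches the nine "loaded" lines above. A minimal sketch of an alternative is to keep @everywhere out of the module and run the include once from the driver script. The temporary file and the function double below are hypothetical stand-ins for src/parallel.jl and its contents, and the sketch assumes local workers that can read the same file system.

```julia
using Distributed

addprocs(2)

# Hypothetical stand-in for src/parallel.jl, written to a temporary
# directory so that the sketch is self-contained.
path = joinpath(mktempdir(), "parallel.jl")
write(path, """
module PMod
double(x) = 2x
end # module PMod
""")

# Run from the driver script (Main), not from inside a package module,
# so each process includes the file exactly once.
@everywhere include($path)
@everywhere using .PMod

# PMod now exists in Main on every process, so its functions can be
# handed to pmap.
doubled = pmap(PMod.double, 1:3)
```

Because the include happens in Main on every process, using .PMod resolves the same way on the master and on the workers.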
ArgumentError: Package ... is required but does not seem to be installed:
- Run `Pkg.instantiate()` to install all recorded dependencies.
etc.,
but I found the reason just now. The “startup.jl” file is only executed on the main process, not on the workers, so LOAD_PATH on the workers did not include the required directories. On 0.6.4, “juliarc.jl” was always executed on the workers as well. Running @everywhere println(LOAD_PATH) shows this. So now I just need to find out how to get this LOAD_PATH to all my workers and I should be fine. No need for Pkg.activate and all that stuff.
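One way to get the same LOAD_PATH entries onto the workers, sketched under the assumption that the extra entries are plain directory paths visible to every process, is to push them there with @everywhere. The directory below is a temporary placeholder, not a path from this thread.

```julia
using Distributed

addprocs(2)

# Placeholder for a directory that startup.jl would normally add to
# LOAD_PATH on the main process.
extra_dir = mktempdir()
push!(LOAD_PATH, extra_dir)

# startup.jl is not run on the workers, so add the entry there explicitly.
# `$extra_dir` interpolates the local value into the remote expression.
@everywhere workers() push!(LOAD_PATH, $extra_dir)

# Collect each worker's LOAD_PATH to confirm that the entry arrived.
worker_paths = [remotecall_fetch(() -> copy(Base.LOAD_PATH), w) for w in workers()]
```

Another option that may be worth trying is setting the JULIA_LOAD_PATH environment variable before starting Julia, since locally started workers inherit the environment of the main process.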
Have you been able to resolve this? I am struggling with the same issue for packages SpecialFunctions and Match… I have tried all suggested remedies (Pkg.instantiate etc.) to no avail.
This should work if you split the using MyPackage into its own @everywhere block (because the macro moves all import statements to the top of the block, breaking your original attempt).
Thanks for the suggestion. I am still having problems though, even just getting standard packages like ParallelDataTransfer, Interpolations… SpecialFunctions to work. I’m not even attempting my own packages for now.
<<left out loading of all packages on master processor – this works without fail>>
using Distributed
addprocs(2) # verified that I now have 3 procs
@everywhere begin
using Pkg
Pkg.activate(".")
end
@everywhere using ParallelDataTransfer # FAILS, SEE OUTPUT BELOW
@everywhere using FFTW # OK!
@everywhere using LinearAlgebra # OK!
@everywhere using SpecialFunctions # FAILS
@everywhere using LowRankApprox # FAILS
@everywhere using Interpolations # FAILS
@everywhere using Match # FAILS
The message generated for ParallelDataTransfer is
On worker 2: ArgumentError: Package ParallelDataTransfer not found in current path:
Run import Pkg; Pkg.add("ParallelDataTransfer") to install the ParallelDataTransfer package.
require at .\loading.jl:823
eval at .\boot.jl:328
#116 at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:276
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:56
run_work_thunk at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\process_messages.jl:65
#102 at .\task.jl:259
#remotecall_wait#154(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:421
remotecall_wait(::Function, ::Distributed.Worker, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:412
#remotecall_wait#157(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
remotecall_wait(::Function, ::Int64, ::Module, ::Vararg{Any,N} where N) at C:\Users\julia\AppData\Local\Julia-1.1.1\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:433
(::getfield(Distributed, Symbol("##161#163")){Module,Expr})() at .\task.jl:259
…and 1 more exception(s).
in top-level scope at stdlib\v1.1\Distributed\src\macros.jl:183
in remotecall_eval at stdlib\v1.1\Distributed\src\macros.jl:199
in macro expansion at base\task.jl:245
in sync_end at base\task.jl:226
Have you confirmed that Pkg.activate(".") actually does what you think? "." assumes that your workers started in the directory you expect; maybe try @everywhere println(pwd()) before trying to load packages to confirm you are where you think you are.
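Along the same lines, here is a minimal sketch of starting the workers in a known directory with a known project, so that neither "." nor the active environment has to be guessed. It assumes local workers and uses the dir and exeflags keywords of addprocs together with Base.active_project(); whether this fixes the failures reported above is not confirmed in this thread.

```julia
using Distributed

# Path of the project file that is active on the main process.
# (Assumes a project is active, i.e. this does not return `nothing`.)
project = Base.active_project()

# Start the workers in the current directory with the same project active.
addprocs(2; dir = pwd(), exeflags = "--project=$project")

# Ask every worker where it is and which project it sees.
worker_dirs     = [remotecall_fetch(pwd, w) for w in workers()]
worker_projects = [remotecall_fetch(Base.active_project, w) for w in workers()]

println(worker_dirs)
println(worker_projects)
```

If the printed directories and project paths match those of the main process, a following @everywhere using SomePackage should resolve packages against the same environment on every process.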