Has Julia code loading in distributed computing changed?

Importing a module after addprocs() seems to try to load it on all workers.

(@v1.6) pkg> activate .
  Activating environment at `C:\Users\blablabla...\Project.toml`

(MyProject) pkg> status
     Project BoseHubbardLyapunovExponents v0.1.0
      Status `C:\Users\okidoki....\Project.toml`
  [2b5f629d] DiffEqBase v6.60.0
  [0c46a032] DifferentialEquations v6.16.0
  [634d3b9d] DrWatson v2.0.2
  [61744808] DynamicalSystems v1.7.4
  [6e36e845] DynamicalSystemsBase v1.8.2
  [59287772] Formatting v0.4.2
  [f67ccb44] HDF5 v0.15.4
  [b964fa9f] LaTeXStrings v1.2.1
  [961ee093] ModelingToolkit v5.16.0
  [429524aa] Optim v1.3.0
  [91a5bcdd] Plots v1.13.2
  [438e738f] PyCall v1.92.3
  [d330b81b] PyPlot v2.9.0
  [8ba89e20] Distributed
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging

julia> using Distributed

julia> addprocs()
8-element Vector{Int64}:
 2
 3
 4
 5
 6
 7
 8
 9

julia> using HDF5
ERROR: On worker 2:
ArgumentError: Package HDF5 [f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:
 [1] _require
   @ .\loading.jl:990
 [2] require
   @ .\loading.jl:914
 [3] #1
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\Distributed.jl:79
 [4] #103
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:274
 [5] run_work_thunk
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:63
 [6] run_work_thunk
   @ C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\process_messages.jl:72
 [7] #96
   @ .\task.jl:406

...and 7 more exceptions.

Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base .\task.jl:364
 [2] macro expansion
   @ .\task.jl:383 [inlined]
 [3] _require_callback(mod::Base.PkgId)
   @ Distributed C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Distributed\src\Distributed.jl:76
 [4] #invokelatest#2
   @ .\essentials.jl:708 [inlined]
 [5] invokelatest
   @ .\essentials.jl:706 [inlined]
 [6] require(uuidkey::Base.PkgId)
   @ Base .\loading.jl:920
 [7] require(into::Module, mod::Symbol)
   @ Base .\loading.jl:901

julia>
julia> versioninfo()
Julia Version 1.6.0
Commit f9720dc2eb (2021-03-24 12:55 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, skylake)

This has been the behavior for as far back as I can remember. Note that by default addprocs() doesn’t change the workers’ active environment to match the main process, so you may want something like addprocs(exeflags="--project=.").
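
As a rough sketch of that suggestion, the snippet below starts workers with the same project as the main process and prints what each process considers its active project. The worker count of 2 is arbitrary, and `Base.active_project()` is a non-exported helper, so treat this as an illustration under those assumptions rather than the one right way.

```julia
using Distributed

# Pass the main process's active project file on to each worker.
# (exeflags="--project=." is the variant suggested above, for when
# Julia was started inside the project folder.)
project = Base.active_project()
addprocs(2; exeflags="--project=$(project)")

# Ask every process, including the main one, which project it is using.
for p in procs()
    println("process ", p, ": ", remotecall_fetch(Base.active_project, p))
end
```

If the workers report the same Project.toml as process 1, packages recorded in that project should be resolvable on them as well.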

I am not using @everywhere in front of using HDF5. I want to load the module only on the main process, not on the worker processes. I don't recall modules ever being loaded automatically on all workers.

You can either load it before adding the distributed workers, or run @eval using HDF5 afterwards, which will load it only on the main process.
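
To illustrate the first option, here is a sketch that loads the package before any workers exist and then checks where it ended up. It assumes HDF5 is installed in the active project. `is_loaded` is a hypothetical helper that peeks at the internal `Base.loaded_modules` table, so it may not be stable across Julia versions.

```julia
using Distributed

using HDF5      # loaded while only process 1 exists
addprocs(2)     # workers are added afterwards

# Hypothetical helper: is a package with this name loaded in the current process?
# Defined on all processes so that it can be called remotely.
@everywhere is_loaded(name) = any(id -> id.name == name, keys(Base.loaded_modules))

# Report the load status of HDF5 on every process.
for p in procs()
    println("process ", p, ": HDF5 loaded = ", remotecall_fetch(is_loaded, p, "HDF5"))
end
```

With this ordering, the workers start after the package was loaded, so only process 1 should report it as loaded.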

OK. What is the @everywhere macro for, then? Just for user functions and structs?

Yeah, those, and in general anything you want evaluated everywhere. It just so happens that using Foo on the main process already loads Foo on all workers automatically (but note that it doesn’t import it on the workers, so Foo's names are not brought into scope there, which is a subtle difference).
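
A small sketch of both uses, with made-up names: `scaled_norm` is a hypothetical user function, and LinearAlgebra stands in for any package whose names the workers need in scope. The worker count of 2 is arbitrary.

```julia
using Distributed
addprocs(2)

# Bring `norm` into scope on every process, rather than only loading the package.
@everywhere using LinearAlgebra

# Hypothetical user function, defined on every process.
@everywhere scaled_norm(v) = 2 * norm(v)

# Each call runs on a worker, which needs both `scaled_norm` and `norm` in scope.
results = pmap(scaled_norm, [[3.0, 4.0], [6.0, 8.0]])
println(results)
```

Without the two @everywhere lines, the workers would have no definition of `scaled_norm`, and `norm` would not be in scope in their Main module.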

Ah, I see! Thank you. I have somehow managed to never stumble over this in 2 years of using Julia. :grinning: