Precompiled packages when $HOME is shared among different systems?

Hi,
I’m running Julia on our clusters. At the moment, the users’ $HOME directories are shared between two clusters while the new one is phased in.

Now I’d like to run MPI applications in this configuration. For convenience, I’m using MPItrampoline (eschnett/MPItrampoline on GitHub), a forwarding MPI implementation that can use any other MPI implementation via an MPI ABI.

I instantiate a small project and can run MPI programs on the old cluster:

rkube@nid00009:~/julia_envs/xgc_analysis_cori> cat Project.toml
name = "xgc_analysis_cori"
uuid = "b1d4e0b1-884e-40f9-9bb8-eab53f6d8b80"
authors = ["Ralph Kube <ralph_kube@gmx.net>"]
version = "0.1.0"

[deps]
ADIOS2 = "e0ce9d3b-0dbd-416f-8264-ccca772f60ec"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"

[extras]
MPIPreferences = "3da0fdf6-3ccc-4f1b-acd9-58baa6c99267"
rkube@nid00009:~/julia_envs/xgc_analysis_cori> ls
LocalPreferences.toml  Manifest.toml  Project.toml  src
rkube@nid00009:~/julia_envs/xgc_analysis_cori> cat src/01-mpi-hello.jl
# examples/01-hello.jl
using MPI
MPI.Init()

comm = MPI.COMM_WORLD
print("Hello world, I am rank $(MPI.Comm_rank(comm)) of $(MPI.Comm_size(comm))\n")

rkube@nid00009:~/julia_envs/xgc_analysis_cori> srun -n 4 $HOME/software/julia-1.8.1/bin/julia --project=. src/01-mpi-hello.jl 
Hello world, I am rank 0 of 4
Hello world, I am rank 3 of 4
Hello world, I am rank 1 of 4
Hello world, I am rank 2 of 4

When I do exactly the same on the new cluster, Julia tries to load the MPI library installed on the old system, and the program crashes:

rkube@nid005276:~/julia_envs/xgc_analysis_pm> srun -n 4 $HOME/software/julia-1.8.1/bin/julia --project=. src/01-mpi-hello.jl 
ERROR: LoadError: InitError: could not load library "/opt/cray/pe/mpt/7.7.19/gni/mpich-gnu/8.2/lib/libmpich"
/opt/cray/pe/mpt/7.7.19/gni/mpich-gnu/8.2/lib/libmpich.so: cannot open shared object file: No such file or directory
Stacktrace:
  [1] dlopen(s::String, flags::UInt32; throw_error::Bool)
    @ Base.Libc.Libdl ./libdl.jl:117
  [2] dlopen
    @ ./libdl.jl:116 [inlined]
  [3] __init__()
    @ MPI ~/.julia/packages/MPI/08SPr/src/MPI.jl:66
  [4] _include_from_serialized(pkg::Base.PkgId, path::String, depmods::Vector{Any})
    @ Base ./loading.jl:831
  [5] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt64)
    @ Base ./loading.jl:1039
  [6] _require(pkg::Base.PkgId)
    @ Base ./loading.jl:1315
  [7] _require_prelocked(uuidkey::Base.PkgId)
    @ Base ./loading.jl:1200
  [8] macro expansion
    @ ./loading.jl:1180 [inlined]
  [9] macro expansion
    @ ./lock.jl:223 [inlined]
 [10] require(into::Module, mod::Symbol)
    @ Base ./loading.jl:1144
during initialization of module MPI
in expression starting at /global/u2/r/rkube/julia_envs/xgc_analysis_pm/src/01-mpi-hello.jl:2

Does anyone have an idea how to keep Julia installations separate for multiple systems?

Set JULIA_DEPOT_PATH to separate directories.

But from what I can gather from the error message, that shouldn’t have much to do with the depot. What version of MPI.jl are you using?

Hmm, perhaps one might want to use shared environments and have the environments in the depots. Then you can have different LocalPreferences.toml in each depot.
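
To make the shared-environment idea a bit more concrete, here is a rough sketch. It assumes one depot per cluster and relies on Julia keeping named shared environments under `<depot>/environments/<name>`, which is what `--project=@name` selects. The depot paths, the environment name `xgc_analysis`, and the helper functions are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: depot paths and the environment name are hypothetical.

# Print where Julia keeps the shared environment NAME inside DEPOT.
shared_env_dir() {
    local depot="$1" name="$2"
    printf '%s/environments/%s\n' "$depot" "$name"
}

# Create the shared environment directory in a depot so that it can hold
# its own Project.toml, Manifest.toml, and LocalPreferences.toml.
init_shared_env() {
    local dir
    dir="$(shared_env_dir "$1" "$2")"
    mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Example (run once per cluster, with that cluster's depot):
#   init_shared_env "${HOME}/.julia_host1" xgc_analysis
#   JULIA_DEPOT_PATH="${HOME}/.julia_host1" julia --project=@xgc_analysis
```

Since each depot then has its own copy of the environment, the LocalPreferences.toml next to each Project.toml can point to a different MPI library.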

Thanks for the help. I’m solving the problem by setting JULIA_DEPOT_PATH in my .bashrc based on the cluster I’m logging in to:
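# Pick a separate Julia depot for each cluster.
# Note: no spaces are allowed around "=" in a shell assignment.
case "$LOGIN_HOST" in
    "cluster1")
        export JULIA_DEPOT_PATH="${HOME}/.julia_host1"
        ;;
    "cluster2")
        export JULIA_DEPOT_PATH="${HOME}/.julia_host2"
        ;;
esac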

Then I’m installing MPI.jl on each cluster using the system’s MPI library.
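
For anyone finding this later, the per-cluster MPI setup could look roughly like the sketch below. It assumes MPI.jl v0.20 or later, where the MPI library is selected through MPIPreferences, and that the cluster's MPI module is already loaded. `JULIA_BIN` and the function name are hypothetical and only there so the Julia executable can be swapped out.

```shell
#!/usr/bin/env bash
# Sketch only: JULIA_BIN is a hypothetical override for the julia executable.

# Point MPI.jl at the system MPI library for the project in directory $1.
# The choice is recorded in that project's LocalPreferences.toml.
configure_system_mpi() {
    local project="${1:-.}"
    "${JULIA_BIN:-julia}" --project="$project" -e '
        using Pkg
        Pkg.add("MPIPreferences")
        using MPIPreferences
        MPIPreferences.use_system_binary()'
}

# Example, run once per cluster after loading that cluster's MPI module:
#   configure_system_mpi "$HOME/julia_envs/xgc_analysis_pm"
```

With a separate depot and a separate project directory per cluster, the precompiled MPI.jl cache and the recorded library path no longer leak from one system to the other.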

Then I can set up environments on each cluster as usual.