I’m an administrator of a shared compute cluster, and I’d like to provide a few packages in a sort of “base environment” for all users on the system. What’s the recommended workflow for this?
Specifically, I’d like to have MPI.jl and CUDA.jl pre-installed for all users, so I can make sure they’re configured correctly for our system.
Research Scientist II
Partnership for an Advanced Computing Environment (PACE)
Georgia Institute of Technology
At NERSC they have `JULIA_LOAD_PATH` pointing to a place where they installed the packages. When you do `module load julia`, these environment variables get defined, so when you use Julia you have access to those packages. They do it specifically for things like MPI.jl.
Unfortunately I am not familiar with the specifics.
@RonRahaman - you are at GT?
Cool, I am at GTRI.
If you could DM me your email, we could set up a time to chat - happy to assist.
Doesn’t a read-only depot path cause compilation of packages like IJulia to error, because they try to write to a read-only path?
You can have multiple depots, and I seem to remember only the first one is supposed to be writable.
That would make sense. At NERSC the first entry is the user depot.
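For illustration, here is a minimal shell sketch of that ordering, with the user depot first and a site depot second. Both paths are hypothetical placeholders, and the idea that Julia writes only to the first depot is how I understand the behavior described above, not something verified on a particular cluster.

```shell
# Hypothetical locations: adjust for your site.
USER_DEPOT="${HOME}/.julia"           # per-user depot, writable by the user
SITE_DEPOT="/opt/site/julia-depot"    # admin-managed depot, read-only for users

# User depot first, site depot second.
export JULIA_DEPOT_PATH="${USER_DEPOT}:${SITE_DEPOT}"

# The first entry is the one new installs and caches are expected to go to.
first_depot="${JULIA_DEPOT_PATH%%:*}"
echo "first depot: ${first_depot}"
```

A site module file could export the same variable when users run `module load julia`.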
Is this documented anywhere?
One problem I see with your goal is that most packages are in active development and they have new releases every few weeks. If you want to provide in the system depot only one version for each package, the odds that users will use precisely that one are very low.
Thanks all! I think I understand
`JULIA_LOAD_PATH` now. From the docs, I see that setting `JULIA_LOAD_PATH` will prepend `JULIA_LOAD_PATH` to the default load path. That’s not exactly what I want, since I’d like the packages in `~/.julia` to have higher precedence than the site packages (in case the users want a different configuration than the site packages). But I can mess around with it.
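To make the ordering concrete, here is a minimal shell sketch of the two ways a site module could set `JULIA_LOAD_PATH`. The path `/opt/site/julia-env` is a hypothetical placeholder, and the sketch relies on the documented behavior that an empty entry in `JULIA_LOAD_PATH` expands to the default load path.

```shell
# Start from a clean state for the illustration (a real module file would not unset this).
unset JULIA_LOAD_PATH

# Hypothetical site environment directory.
SITE_ENV="/opt/site/julia-env"

# Prepending: the site environment is searched before the defaults.
prepended="${SITE_ENV}:${JULIA_LOAD_PATH:-}"

# Appending: the defaults, including the user's own environments, are searched first.
appended="${JULIA_LOAD_PATH:-}:${SITE_ENV}"

echo "prepended: ${prepended}"
echo "appended:  ${appended}"

# Pick the appended form so user-installed packages take precedence.
export JULIA_LOAD_PATH="${appended}"
```

With the variable unset beforehand, the appended form starts with an empty entry, so Julia would look through the default load path first and the site environment last.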
Yes, that about sums up my life as an admin. The best solution IMO would be for `~/.julia` to take precedence over the directory for my site package. Looks like simply setting `JULIA_DEPOT_PATH` to my site directory will not establish my desired precedence, though.
Again, disclaimer: I am only a user in a cluster.
At NERSC I usually don’t use the module, so I use only my own installed packages. However, the times I have used the module, it worked like this.
If I only do `using MPI`, it will load the systemwide package.
If I dare to do `]add MPI` and then `using MPI`, my locally installed package will have precedence.
If you are not interested in the systemwide packages, you can simply avoid loading the `julia` module.
Seems kind of what you want to do? But it is not ideal, to be honest. So far using the artifacts has worked relatively well for me, but I have not tested whether using the systemwide installs of MPI, NetCDF, etc. would give better performance because those are linked to the vendor-optimized libraries.
I think my primary concern is the burden on the users to configure it properly. It’s good to know that `]add MPI` works well for you.
I’m a little concerned with CUDA.jl, though. A manual installation could end up being a bad experience for the users: I would need to make sure that the users load the CUDA toolkit module correctly before they run `]add CUDA`, and the CUDA toolkit is not available on every node, and lots more nonsense. So I think that is definitely a good candidate for a systemwide install with properly managed modules and packages.
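As a rough sketch of the kind of guard a site wrapper could run before a user touches CUDA.jl: the function name `cuda_toolkit_available` is hypothetical, and using `nvcc` on `PATH` as the marker for a loaded toolkit is an assumption. A real site might check the module system instead.

```shell
# Succeed only if a CUDA compiler driver is visible on PATH.
# Assumption: a loaded CUDA toolkit module puts `nvcc` on PATH.
cuda_toolkit_available() {
    command -v nvcc >/dev/null 2>&1
}

if cuda_toolkit_available; then
    echo "CUDA toolkit found: $(command -v nvcc)"
else
    echo "CUDA toolkit not found: load the CUDA module or move to a GPU node first" >&2
fi
```

The check is cheap enough to run in a module file or login script, so users on nodes without the toolkit get a clear message instead of a confusing failure later.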
Contrary to some of the languages traditionally used in research computing environments (I’m thinking of C/C++ and Fortran), Julia actually comes with a decent package manager, so using versions of packages different from those provided by the environment is even more common.
I don’t think you want to touch `JULIA_LOAD_PATH` at all, but only the depot. For that,
should achieve the desired effect.
Would it be possible for the admin to provide a Julia system image with CUDA.jl and MPI.jl compiled in and, perhaps, block the installation of different versions of these packages?
Of course users can still have their local Julia and do what they want, but having a sys image with those packages included in a HPC environment sounds practical.
I looked at the docs again, and I think I finally understand the behavior when `JULIA_DEPOT_PATH` is not defined. That does indeed seem to be what I want. Thanks again!
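As a sketch of that behavior, assuming a hypothetical site depot at `/opt/site/julia-depot` and a hypothetical helper named `append_depot`: appending to an unset `JULIA_DEPOT_PATH` leaves a leading empty entry, which Julia expands to its default depots, so the site depot ends up last. The exact expansion of empty entries has varied between Julia versions, so it is worth checking the docs for the version you deploy.

```shell
# Hypothetical helper: append a depot to an existing (possibly empty) depot path.
# $1: current JULIA_DEPOT_PATH value, $2: depot to append
append_depot() {
    printf '%s:%s\n' "$1" "$2"
}

SITE_DEPOT="/opt/site/julia-depot"   # hypothetical site depot

# Unset case: the result starts with an empty entry (":...").
append_depot "" "${SITE_DEPOT}"

# Already-set case: the user's existing entries stay in front.
append_depot "/scratch/me/depot" "${SITE_DEPOT}"

# What a module file might actually export.
export JULIA_DEPOT_PATH="$(append_depot "${JULIA_DEPOT_PATH:-}" "${SITE_DEPOT}")"
```

Either way the site depot is searched after whatever the user already has, which gives `~/.julia` precedence in the default case.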
Great question. If possible, can you mark a post as a solution to help others down the line? Thanks.
Hopefully I add something to the debate here. We mention “traditional” HPC languages C/C++ and Fortran.
TACC in Texas has developed XALT, which tracks the executables and libraries that are really being used in an HPC environment. If I recall correctly, it works by prepending a library path when a job is run. Please have a look at it.
Something tells me that it probably does not work very well with a Julia environment - though I could be wrong.
Just like Python scripts run under `python`, Julia scripts run under `julia`. So XALT will not tell you much about the actual application being run, unless you also look at the arguments passed.
In general, at our HPC site most users run the precompiled packages we provide them (e.g. GROMACS), so that doesn’t really relate to the language used, as users themselves don’t pick it. We currently only see a trickle of Julia users, but I intend to look closer at the XALT stats on that in the near future.