I’m an administrator of a shared compute cluster, and I’d like to provide a few packages in a sort of “base environment” for all users on the system. What’s the recommended workflow for this?
Specifically, I’d like to have MPI.jl and CUDA.jl pre-installed for all users, so I can make sure they’re configured correctly for our system.
Many thanks,
Ron
Ron Rahaman
Research Scientist II
Partnership for an Advanced Computing Environment (PACE)
Georgia Institute of Technology
At NERSC they have JULIA_DEPOT_PATH and JULIA_LOAD_PATH pointing to a place where they installed the packages. When you run module load julia, these environment variables get defined, so when you use Julia you have access to those packages. They do it specifically for things like MPI.jl.
Unfortunately I am not familiar with the specifics.
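For illustration only, here is a sketch of the kind of exports such a module might perform. The paths are hypothetical (the real NERSC values are not given in this thread), and putting the site entries first is an assumption that follows the idiom in the Julia docs. An empty entry in either variable is expanded by Julia to its default value, and earlier entries take precedence over later ones.

```shell
# Hypothetical site locations; the real NERSC paths are not given in this thread.
SITE_DEPOT="/opt/site/julia/depot"
SITE_ENV="/opt/site/julia/environments/base"

# Start from a clean state so the result is predictable in this sketch.
unset JULIA_DEPOT_PATH JULIA_LOAD_PATH

# Site entries first; the trailing empty entry is expanded by Julia to its defaults.
export JULIA_DEPOT_PATH="${SITE_DEPOT}:${JULIA_DEPOT_PATH:-}"
export JULIA_LOAD_PATH="${SITE_ENV}:${JULIA_LOAD_PATH:-}"

echo "JULIA_DEPOT_PATH=${JULIA_DEPOT_PATH}"
echo "JULIA_LOAD_PATH=${JULIA_LOAD_PATH}"
```

With this ordering the site packages win over anything in ~/.julia; moving the empty entry to the front reverses that.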
One problem I see with your goal is that most packages are in active development and they have new releases every few weeks. If you want to provide in the system depot only one version for each package, the odds that users will use precisely that one are very low.
Thanks all! I think I understand JULIA_LOAD_PATH now. From the docs, I see that setting JULIA_LOAD_PATH will prepend JULIA_LOAD_PATH to the default load path.
That’s not exactly what I want, since I’d like the packages in ~/.julia to have higher precedence than the site packages (in case the users want a different configuration than the site packages). But I can mess around with it.
Yes, that about sums up my life as an admin. The best solution IMO would be for ~/.julia to take precedence over the directory for my site package. Looks like simply setting JULIA_LOAD_PATH and JULIA_DEPOT_PATH to my site directory will not establish my desired precedence, though.
At NERSC I usually don’t load the module, so I use only my own installed packages. However, the one time I did use the module, it worked like this.
If I only do using MPI it will load the systemwide module.
If I dare to do ]add MPI and then using MPI, my locally installed package will have precedence.
If you are not interested in the systemwide packages, you can simply avoid loading the julia module.
Seems like kind of what you want to do? But it is not ideal, to be honest. So far, using the artifacts has worked relatively well for me, but I have not tested whether using the systemwide installs of MPI, NetCDF, etc. would give better performance, since those are linked to the vendor-optimized libraries.
I think my primary concern is the burden on the users to configure it properly. It’s good to know that ]add MPI works well for you.
I’m a little concerned with CUDA.jl, though. A manual installation could end up being a bad experience for the users: I would need to make sure that the users load the CUDA toolkit module correctly before they run ]add CUDA, and the CUDA toolkit is not available on every node, and lots more nonsense. So I think that is definitely a good candidate for a systemwide install with properly managed modules and packages.
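As a rough sketch of the kind of check a site wrapper or job script could run before touching CUDA.jl: the helper name require_commands is made up for this example, and treating nvcc and nvidia-smi on PATH as a sign that the toolkit module is loaded on a GPU node is an assumption, not something CUDA.jl itself requires.

```shell
# Hypothetical helper: succeed only if every named command is found on PATH.
require_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            return 1
        fi
    done
    return 0
}

# Assumption: nvcc and nvidia-smi being present indicates a loaded CUDA
# toolkit module and a node with a GPU driver.
if require_commands nvcc nvidia-smi; then
    echo "CUDA toolkit found"
else
    echo "CUDA toolkit not found; load the CUDA module or pick a GPU node"
fi
```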
Contrary to some of the languages traditionally used in research computing environments (I’m thinking of C/C++ and Fortran), Julia actually comes with a decent package manager, so using versions of packages different from those provided by the environment is even more common.
I don’t think you want to touch JULIA_LOAD_PATH at all, but only the depot. For that,
Would it be possible for the admin to provide a Julia system image with CUDA and MPI compiled in and, perhaps, block the installation of different versions of these packages?
Of course users can still have their local Julia and do what they want, but having a system image with those packages included in an HPC environment sounds practical.
I looked at the docs again, and I think I finally understand the behavior of JULIA_DEPOT_PATH="${JULIA_DEPOT_PATH}:/foo/bar" when JULIA_DEPOT_PATH is not defined. That does indeed seem to be what I want. Thanks again!
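To spell out what that assignment produces, here is a small sketch using /foo/bar as the placeholder site depot from above. The user path in the second case is made up, and ${JULIA_DEPOT_PATH:-} is used instead of ${JULIA_DEPOT_PATH} only so the snippet also runs under set -u.

```shell
# Placeholder site depot, as in the post above.
SITE_DEPOT="/foo/bar"

# Case 1: the user has not set JULIA_DEPOT_PATH.
unset JULIA_DEPOT_PATH
unset_case="${JULIA_DEPOT_PATH:-}:${SITE_DEPOT}"
echo "unset:   ${unset_case}"   # prints "unset:   :/foo/bar"

# Case 2: the user already points at a depot of their own (hypothetical path).
JULIA_DEPOT_PATH="/home/user/custom_depot"
set_case="${JULIA_DEPOT_PATH:-}:${SITE_DEPOT}"
echo "already: ${set_case}"     # prints "already: /home/user/custom_depot:/foo/bar"

unset JULIA_DEPOT_PATH
```

In the first case the leading empty entry is expanded by Julia to the default depots, so ~/.julia stays ahead of the site depot. In the second case the user's own depot comes first and the site depot is still appended.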
Hopefully I add something to the debate here. We mention “traditional” HPC languages C/C++ and Fortran.
TACC (the Texas Advanced Computing Center) has developed XALT, which tracks the executables and libraries that are REALLY being used in an HPC environment. If I recall correctly, it works by prepending a library path when a job is run. Please have a look at it.
Something tells me that it probably does not work very well with a Julia environment - though I could be wrong.
Just as Python scripts run under python, Julia scripts run under julia. So XALT will not tell you much about the actual application being run, unless you also look at the arguments passed.
In general, at our HPC site most users run the precompiled packages we provide them (e.g. GROMACS), so that doesn’t really relate to the language used, as users themselves don’t pick it. We currently only see a trickle of Julia users, but I intend to look closer at the XALT stats on that in the near future.