About implementing a shared Julia environment for HPC clusters at CSC

Hello! I work at CSC, an HPC center in Finland, and I have been setting up a shared Julia environment on our HPC clusters (Puhti, Mahti, and LUMI). These clusters use Lmod for module environments and Slurm as the workload manager. I used Ansible to install and configure the environments. You can find the source here: csc-env-julia.

The Julia environment I implemented consists of the following modules:

  • julia module that sets paths to the official Julia binaries and default values for the Julia thread count (JULIA_NUM_THREADS), the Julia CPU thread count (JULIA_CPU_THREADS), and the linear algebra backend thread counts (OPENBLAS_NUM_THREADS and MKL_NUM_THREADS). The module version corresponds to the Julia version. A sketch for checking these defaults follows this list.

  • julia-mpi module that loads global preferences for MPI.jl to use the system MPI installation (see the MPI sketch below). The module version corresponds to the MPI.jl version.

  • julia-cuda module that loads global preferences for CUDA.jl to use the system CUDA installation (see the GPU sketch below). The module version corresponds to the CUDA.jl version.

  • julia-amdgpu module that loads global preferences for AMDGPU.jl to use the system ROCm installation (also covered in the GPU sketch below). The module version corresponds to the AMDGPU.jl version.

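To make the thread defaults concrete, here is a minimal sketch of what a user could run after `module load julia` to see the values the module exported. The actual defaults are cluster-specific, so this is illustrative rather than CSC's exact configuration:

```julia
# Inspect the defaults set by the julia module.
using LinearAlgebra

println("Julia threads:     ", Threads.nthreads())      # set via JULIA_NUM_THREADS
println("Julia CPU threads: ", Sys.CPU_THREADS)         # overridden via JULIA_CPU_THREADS
println("BLAS threads:      ", BLAS.get_num_threads())  # set via OPENBLAS_NUM_THREADS / MKL_NUM_THREADS
```
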
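For the MPI preferences, the standard mechanism is the MPIPreferences workflow from the MPI.jl documentation; the usual way to make such preferences global is to put a directory containing the generated LocalPreferences.toml on JULIA_LOAD_PATH, which I assume is roughly what the module does. A hedged sketch of the two sides of that workflow:

```julia
# Admin side (run once when building the julia-mpi module):
# writes preferences that point MPI.jl at the system MPI library.
using MPIPreferences
MPIPreferences.use_system_binary()

# User side (after `module load julia-mpi`):
# verify which MPI implementation MPI.jl picked up.
using MPI
MPI.versioninfo()
```
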
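Similarly for the GPU modules, users can check whether CUDA.jl or AMDGPU.jl picked up the system toolkit instead of downloading its own artifacts. A sketch, where each half only applies on a cluster with the corresponding hardware:

```julia
# On an NVIDIA cluster, after `module load julia-cuda`:
using CUDA
CUDA.versioninfo()    # should report the system CUDA installation

# On an AMD cluster such as LUMI, after `module load julia-amdgpu`:
using AMDGPU
AMDGPU.versioninfo()  # should report the system ROCm installation
```
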
I imagine other clusters could use a similar module structure, and this kind of consistency could help Julia adoption and usage across HPC systems.

I have also written documentation about how to use the Julia application and how to run Julia batch jobs on the clusters.

Also, big thanks to all who have contributed to the Julia on HPC clusters page. It has been a very useful resource, especially the advice about creating the global preferences for MPI and GPU packages.

Any feedback is appreciated!