Asking for an official guide to Julia installation, deployment, and use on a Linux cluster

That’s great! Looking forward to seeing it born! :grinning:

Last time I tried, I was able to avoid this issue by running a very small computation sequentially on one node, before running the full parallel computation. The rationale was that if precompilation is performed sequentially by a single node, then there wouldn’t be conflicts between nodes at the beginning of the parallel run. Not sure this is guaranteed to work, but maybe worth trying?
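A minimal sketch of that warm-up pattern in a SLURM batch script (the node counts, warmup.jl, and my_parallel_job.jl are placeholders, not from the original post):

#!/bin/bash
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=32

# Warm-up: run a tiny serial job on a single task first, so package
# precompilation happens once instead of racing across all nodes.
srun --nodes=1 --ntasks=1 julia warmup.jl

# Full parallel run; the shared depot should now hold the compiled caches.
srun julia my_parallel_job.jl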

@greatpet, consider adding those tips to our guide; they are very helpful.


Yes, that’s something that has worked here and there, but I guess it boils down to how many invalidations are caused during processing. Some jobs might still trigger recompilation… That being said, this workaround is not consistent enough, unfortunately.

Writing this line in the submission script doesn’t seem to take priority over the one in ~/.bashrc. I submitted a SLURM script while an environment variable in my ~/.bashrc pointed to another version of Julia, and it turned out that the Julia actually executing the code was not the julia-1.9.3 I specified in the SLURM script.

Do I also have to make sure that no other version of Julia is specified in the ~/.bashrc file before I submit such a SLURM script?

Yes, the first entries in PATH take precedence over later ones. If you prepend the Julia path to the existing contents of PATH, then the version specified in the SLURM submission script will win:

export PATH=/path/to/julia-1.9.3/bin:$PATH
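For context, a minimal submission script using this might look like the sketch below (the Julia path and my_script.jl are placeholders):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1

# Prepend the desired Julia so it wins over whatever ~/.bashrc puts on PATH.
export PATH=/path/to/julia-1.9.3/bin:$PATH

# Sanity check: confirm which binary and version will actually run.
which julia
julia --version

julia my_script.jl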

Ah, that makes sense. Thanks a lot!

The way I use Julia on a SLURM cluster: I manage the installation with juliaup, then submit an interactive job like this:

$ salloc --x11 --time=1:00:00 --nodes=1 --ntasks-per-node=126 --constraint=mil --qos=debug

On my cluster this imports the entire environment from the calling shell, so I don’t need to set up PATH etc. Then I call Julia the normal way:

$ julia
julia> using Distributed
julia> addprocs(126)
julia> @everywhere using MyPackage
julia> @everywhere include("my_script.jl")

This works. Submitting a job with sbatch also works. The Julia compiler knows when it needs to recompile packages (usually when I switch between Milan and Cascade Lake nodes; besides the different architecture, they run different OSes too). Compilation takes some time, so I try to run my jobs on the same type of nodes.
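For reference, a batch-mode equivalent of the interactive session above might look roughly like this (the resource lines are placeholders; MyPackage and my_script.jl are taken from the session above):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=126
#SBATCH --time=1:00:00

# One Julia master process that spawns local workers, mirroring the
# interactive session above.
julia -e 'using Distributed; addprocs(126); @everywhere using MyPackage; @everywhere include("my_script.jl")'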

P.S. If your environment variables are not propagated from the login shell into the SLURM session, I believe sbatch --export=ALL ... or setting the SLURM_EXPORT_ENV variable may be useful.
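For example (job.sh is a placeholder name):

$ sbatch --export=ALL job.sh      # forward the caller's full environment to the job
$ export SLURM_EXPORT_ENV=ALL     # or make that the default for later submissions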


Thanks! I’m curious about SLURM’s interactive mode, but I’m not very familiar with it yet.

@WuSiren For a simple interactive job:

$ srun --nodes=1 --ntasks-per-node=1 --pty /bin/bash -i
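A slightly fuller variant, in case you want more resources for the interactive session (the time limit and CPU count are placeholders):

$ srun --time=1:00:00 --nodes=1 --ntasks-per-node=1 --cpus-per-task=8 --pty /bin/bash -i
$ julia -t 8    # match Julia's thread count to the allocated CPUs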
