Error when using @distributed for on cluster with multiple nodes

Pbellive · August 31, 2018, 8:23pm

Oh, yes, if you’re using a job management system you’ll have to manage things a bit differently. I’m not familiar with PBS. You might want to check out the thread:

and also the ClusterManagers package.

I can say that launching Julia via julia -p n, for some number n, is meant for launching multiple workers on a single machine. To launch workers on multiple machines you need to launch Julia with a machine file (the --machine-file option). This post has an example of how to do that with PBS. That’s about all I know. If that doesn’t get you going, I would look around for more resources on, or ask for help with, getting Julia working with cluster job schedulers.
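
To make the difference concrete, here is a rough sketch of the two launch styles. It assumes a PBS job where $PBS_NODEFILE lists one hostname per allocated core and where passwordless SSH works between the nodes. The helper to_machinefile, the hostnames node01/node02, and script.jl are made up for illustration; I haven't tested this on a real PBS cluster.

```shell
#!/bin/sh
# Single machine: start 4 local workers.
#   julia -p 4 script.jl

# Hypothetical helper: collapse a PBS-style node list (one hostname per
# line, repeated once per core) into Julia's "count*host" machine-file form.
to_machinefile() {
    sort | uniq -c | awk '{print $1 "*" $2}'
}

# Demo with made-up hostnames standing in for the contents of $PBS_NODEFILE.
printf 'node01\nnode01\nnode02\nnode02\n' | to_machinefile
# prints:
#   2*node01
#   2*node02

# Inside a real PBS job (assumption: julia is on PATH on every node):
#   to_machinefile < "$PBS_NODEFILE" > machinefile.txt
#   julia --machine-file machinefile.txt script.jl
```

Since each line of a machine file starts one worker by default, passing $PBS_NODEFILE directly to --machine-file should also work; the count*host form just keeps the file short.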

Cheers, Patrick
