Using Julia on a cluster

DrChainsaw · May 9, 2022, 6:02pm

If you post the output of the script it will be much easier for people to help you. Chances are that whatever problem you see doesn’t have anything to do with parallelism.

Does it work if you remove the @threads macro?

About distributed computing: It should be possible to run the task you describe on a cluster. How easy it is depends a little on the type of cluster. The easiest route is probably to use the stdlib Distributed and search online for a package that plugs your type of cluster into Distributed.

You might want to start by spinning up processes locally on the host just to test the plumbing. When running distributed, I have found that errors thrown in the user code tend to end up in some cluster logfile, and all the user sees from the host process is some “worker not responding” error, so make sure the code works locally first.
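As a rough sketch of what testing the plumbing locally could look like, using only the Distributed stdlib: the worker count and the function `slow_square` are made up for illustration and just stand in for your real per-item work.

```julia
using Distributed

# Start two worker processes on the local machine (the count is arbitrary).
new_workers = addprocs(2)

# Definitions must be sent to the workers explicitly.
# `slow_square` is a hypothetical placeholder for the real computation.
@everywhere function slow_square(x)
    sleep(0.1)
    return x^2
end

# pmap hands each element to a free worker and returns the results in input order.
results = pmap(slow_square, 1:8)
println(results)

# Shut the local workers down again when done.
rmprocs(new_workers)
```

If this works locally, the same code should in principle carry over to a cluster by changing only how the workers are added.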

A common pitfall when using process parallelism is to forget that the workers are independent OS processes which don’t share memory. This is a big difference compared to using threads, and it has a lot of implications which might be obvious in hindsight but are easy to overlook and lead to frustrating errors. I have found that it helps a lot to think of each worker as being pretty much the same thing as a separate instance of the REPL that you started manually.

For example, the code in your example would most likely fail because the variables and functions used in the loop do not exist on the workers. Even if they did, they would be completely independent copies, so setting the i-th column on a worker does absolutely nothing to the same variable on the other machines.
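To make that concrete, here is a small sketch run on local workers. The matrix size and the function `compute_column` are hypothetical and are not taken from your code. It first shows that a write on a worker never reaches the host, then one way around it: return the columns from the workers and assemble them on the host.

```julia
using Distributed

new_workers = addprocs(2)

# This matrix lives only in the host process.
result = zeros(3, 4)

# Hypothetical per-column computation, defined on every process.
@everywhere compute_column(i) = fill(Float64(i), 3)

# The argument is serialized and copied to the worker, so the write below
# only changes the worker's copy.
remotecall_fetch(new_workers[1], result) do r
    r[:, 1] .= 1.0
    nothing
end
untouched = copy(result[:, 1])
println(untouched)  # the host's column is unchanged

# Instead, let the workers return their columns and put them together on the host.
columns = pmap(compute_column, 1:4)
assembled = reduce(hcat, columns)

rmprocs(new_workers)
```

For large arrays on a single machine, something like SharedArrays may be a better fit than shipping columns back, but that does not help across machines.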

Here is a thread with some general tips for distributed computing which could be good to read through: The ultimate guide to distributed computing
