Is ClusterManagers.jl maintained? Or, how to do multi-node calculations in Julia?

And it definitely wasn’t because we were shamed into it by this thread… :flushed:

RE: Testing this package.

Is it possible to have a “virtual Slurm cluster” that actually runs on just one machine, in a Docker image or something similar? I assume that would be very helpful for testing.

EDIT: A quick search reveals this: giovtorres/slurm-docker-cluster on GitHub, a Slurm cluster using docker-compose.

Would this be a sensible direction to take for testing?

Here is the relevant issue (note that Valentin references that same repo in the first comment) and a WIP PR that implements it or something like it.

So yes, it seems possible. I don’t know exactly why the PR stalled. It looks like someone needs to be bumped to add permissions?

I remember looking into the dask-jobqueue test suite I added on that PR, and it didn’t look too bad to set up. It covers many cluster systems and can simulate them. I just haven’t had a long enough stretch of time to get it working. Help would be much appreciated if someone is interested: you can either fork that PR, or I can give you push access to my fork.

It’s been a while since that PR was started, so it might be worth pulling in changes from the latest dask-jobqueue in case they have added anything.

ParallelProcessingTools can dynamically add local, Slurm and HTCondor workers pretty well now. It uses a forked version of ClusterManagers.ElasticManager internally (which will be upstreamed to ClusterManagers after a few more changes).
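
As a rough illustration of what adding workers dynamically means, here is a minimal sketch that uses only Julia's Distributed standard library. It does not use the ParallelProcessingTools or ClusterManagers API; `square` is a made-up example function, and the local `addprocs(2)` call is only a stand-in for whatever a cluster manager would do on Slurm or HTCondor.

```julia
using Distributed

# Add two worker processes on the local machine
# (assumption: a stand-in for workers that a cluster manager would launch).
new_ids = addprocs(2)

# Define the example function on every process, including the new workers.
@everywhere square(x) = x^2

# Distribute the calls across the available workers and collect the results.
results = pmap(square, 1:8)

# Remove the workers again once the work is done.
rmprocs(new_ids)
```

Since workers are added and removed at runtime, the same pattern carries over when the processes live on other nodes; only the way they are launched changes.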
