Dynamic cluster without head node?

Hi there,

I’m looking for a sort of dynamically sized cluster manager, but without the requirement for a dedicated head node that’s responsible for adding and removing worker nodes. I know of ElasticManager from ClusterManagers.jl, but that doesn’t quite fit the bill - the head node is still assumed to manage the dynamically coming and going workers.

I guess I’m looking for workers that manage themselves as other workers connect to them? In my specific use case, I can assume that I know the IPs of all possible workers, so one possibility is to start a worker and have it join “the mesh” by announcing itself to each expected node.
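
To make the idea a bit more concrete, here is a rough sketch of the "announce yourself to every expected node" step. It's in Python only to keep it short, and nothing in it is Distributed.jl or ClusterManagers.jl API: `can_reach`, `discover_peers`, the addresses and the port are all made up for illustration, and a plain TCP connect stands in for whatever handshake a real worker would need.

```python
import socket


def can_reach(addr, timeout=0.5):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: treat the peer as not (yet) present.
        return False


def discover_peers(expected, self_addr, probe=can_reach):
    """Probe every expected address except our own and return the reachable ones."""
    return [addr for addr in expected if addr != self_addr and probe(addr)]


if __name__ == "__main__":
    # Hypothetical static list of possible workers.
    expected = [("10.0.0.1", 9000), ("10.0.0.2", 9000), ("10.0.0.3", 9000)]
    print(discover_peers(expected, self_addr=("10.0.0.1", 9000)))
```

A worker starting up would run something like `discover_peers` once and then connect to whoever answered. Nodes that come up later would find it the same way, so no single node has to own the membership list.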

Does anyone know if this sort of thing has been done before or if implementing this based on ElasticManager would be best?

So today I found this, which made me think that this probably isn’t such a good idea after all. (The bug was fixed long ago, so don’t worry)

Maybe it’s better to keep a master node after all? I know there’s existing work in Erlang and other distributed systems on electing a master by vote (leader election), so maybe I’ll have to look into that path…
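
For reference, the simplest rule I can think of is "the highest address among the live nodes is the master", which is roughly the idea behind the bully algorithm. Below is a minimal sketch of just that rule, again in Python and with hypothetical names (`elect_leader`, the addresses). It assumes every node sees the same set of live peers, so it is not a real election protocol and does nothing sensible under network partitions.

```python
import ipaddress


def elect_leader(self_addr, live_peers):
    """Pick the node with the numerically highest (ip, port) as leader.

    Every node that sees the same set of live peers reaches the same answer
    without exchanging any extra messages.
    """
    candidates = [self_addr, *live_peers]
    # Compare IPs numerically so that 10.0.0.10 ranks above 10.0.0.9.
    return max(candidates, key=lambda a: (ipaddress.ip_address(a[0]), a[1]))


if __name__ == "__main__":
    me = ("10.0.0.2", 9000)
    peers = [("10.0.0.10", 9000), ("10.0.0.9", 9000)]
    leader = elect_leader(me, peers)
    print("I am the leader" if leader == me else f"leader is {leader}")
```

When the current master drops out, the remaining nodes would rerun the same rule on the updated peer list. Getting them to agree on that list is the hard part, and that is what the real voting protocols are for.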

Saving resources would definitely need to be more involved than “just dump it to file lol”, unless a reliable NAS/NFS were configured everywhere…