I am using the ClusterManagers.ElasticManager to handle up to hundreds of Julia processes in my cluster. Say I do:
using Distributed, ClusterManagers
c = ClusterManagers.ElasticManager(myip, port, "foobar")
and then submit several jobs through our queueing system, each of which starts a Julia process with
submit2queue echo "using ClusterManagers; ClusterManagers.elastic_worker(\"foobar\")" | julia
The Julia processes start up over some period of time as resources become available. Once I have a considerable number of workers, I may start some computation:
@everywhere f(x) = (sleep(1); 2 * x)
results = pmap(f, 1:100)
This all works nicely until a “late” worker is registered in the ElasticManager while the pmap is doing its job. Julia is smart and tries to utilize these workers as well. However, they do not know what f(x) is, because they missed the @everywhere ..., and therefore throw an error.
How can I initialize such late workers?
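One possible direction, sketched here under assumptions rather than as a confirmed solution: keep the definitions in an expression that can be replayed, and hand pmap a dedicated WorkerPool that a worker only enters after it has evaluated that expression. The names WORKER_INIT, ready_pool, initialized and admit_new_workers!, as well as the 5-second polling interval, are hypothetical and not part of ClusterManagers or Distributed. Only standard Distributed calls are used (WorkerPool, workers, remotecall_wait, push!).

```julia
using Distributed

# Hypothetical name: everything a worker must know before it may receive work.
const WORKER_INIT = quote
    f(x) = (sleep(1); 2 * x)
end

# The master needs f too, because f itself is passed to pmap.
Core.eval(Main, WORKER_INIT)

# Hypothetical pool: a worker is pushed here only after it has run WORKER_INIT.
const ready_pool = WorkerPool()
const initialized = Set{Int}()

# Replay WORKER_INIT on every worker not seen before, then admit it to the pool.
function admit_new_workers!()
    for pid in setdiff(workers(), initialized)
        remotecall_wait(Core.eval, pid, Main, WORKER_INIT)
        push!(initialized, pid)
        push!(ready_pool, pid)
    end
end

# Workers that have already joined.
admit_new_workers!()

# Late joiners: poll every few seconds (interval chosen arbitrarily).
# Note that an exception inside this task stops the polling silently.
watcher = @async while true
    sleep(5)
    admit_new_workers!()
end
```

With this in place, the call from above would become pmap(f, ready_pool, 1:100). Late workers still register with the ElasticManager, but since pmap no longer draws from the default pool, it should only see them once admit_new_workers! has pushed them into ready_pool.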