Using Distributed.jl, is there no built-in way to find out which workers are on the same shared-memory node? This post:
shows a manual implementation based on gethostname(), but I don’t know how robust that is. If there’s nothing built-in, maybe there’s already a package that helps with this?
My goal is to create one SharedArray per cluster node, which is accessible to all workers on that node, so I need to find out which machine each of the workers is running on.
Well, since Distributed.map_pid_wrkr is not documented or exported, I’d be hesitant to rely on it. And it seems like the suggestion is pretty much the equivalent of pmap(_ -> gethostname(), procs())?
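For reference, here is a minimal sketch of the hostname-based grouping, using only exported functions. It uses remotecall_fetch rather than pmap because pmap hands each item to whichever worker happens to be free, so its results would not necessarily line up with the pids in procs(). The names hostname_of and procs_by_host are made up for illustration, and the sketch assumes that gethostname() returns a distinct name on each node, which is exactly the robustness question raised above.

```julia
using Distributed

# Hypothetical helper: ask one specific process for its own hostname.
hostname_of(pid::Integer) = remotecall_fetch(gethostname, pid)

# Hypothetical helper: group process ids by the hostname they report.
# Assumes hostnames are unique per node (not guaranteed on every cluster).
function procs_by_host(pids = procs())
    groups = Dict{String,Vector{Int}}()
    for pid in pids
        push!(get!(groups, hostname_of(pid), Int[]), pid)
    end
    foreach(sort!, values(groups))  # keep each pid list in ascending order
    return groups
end

# Usage: print one line per node with the pids running on it.
for (host, pids) in procs_by_host()
    println(host, " => ", pids)
end
```

This issues one remote call per process, which should be acceptable for a one-time setup step.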
I just noticed that Distributed does at least export check_same_host(pids) (no documentation, though), which also seems to be used by SharedArrays. It does allow one to check whether the given processes are all on the same node, but using it to partition the processes by their nodes would be pretty cumbersome and inefficient.
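Once the pids of a single node are known (however that grouping is obtained), allocating the per-node array could look roughly like the sketch below. The helper name shared_array_on is made up, and the approach rests on two assumptions: that all the given pids really are on one host (the SharedArray constructor is expected to throw otherwise), and that running the constructor on one of those pids is enough to place the shared segment on that node.

```julia
using Distributed
@everywhere using SharedArrays  # the type must be loaded on every process

# Hypothetical helper: create one SharedArray visible to all `pids`.
# The constructor runs on the first pid so the segment is created on its node.
# Assumes every entry of `pids` lives on the same host.
function shared_array_on(::Type{T}, pids::Vector{Int}, dims::Dims) where {T}
    return remotecall_fetch(first(pids)) do
        SharedArray{T}(dims; pids = pids)
    end
end

# Usage on the current process only; on a cluster, pass one node's pids instead.
A = shared_array_on(Float64, [myid()], (4, 2))
```

As far as I understand, a handle fetched by a process that is not in pids carries no local data, so it would only be useful for passing on to the workers of that node.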