tset error when loading a machinefile across multiple machines

question

#1

I can log in over ssh without a password, but I still get errors.
My machinefile lists one host per line.
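
For illustration, a machinefile in that format might look like the following. The hostnames are placeholders borrowed from the addprocs call later in this thread, not the actual contents of the file:

```text
machine01
machine02
```

Each line names one host, and Julia tries to start workers on every host listed.
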
I started Julia with
julia --machinefile machinefile
but I get the tset errors below. Any ideas?

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

tset: standard error: Invalid argument

Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Worker 8 terminated.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
Master process (id 1) could not connect within 60.0 seconds.
exiting.
ERROR (unhandled task failure): Version read failed. Connection closed by peer.
Stacktrace:
[1] process_hdr(::TCPSocket, ::Bool) at ./distributed/process_messages.jl:257
[2] message_handler_loop(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:143
[3] process_tcp_streams(::TCPSocket, ::TCPSocket, ::Bool) at ./distributed/process_messages.jl:118
[4] (::Base.Distributed.##99#100{TCPSocket,TCPSocket,Bool})() at ./event.jl:73
Master process (id 1) could not connect within 60.0 seconds.
exiting.


#2

Can you try using ClusterManagers.jl instead and see if that works?


#3

ClusterManagers.jl provides managers for specific cluster setups, which I don’t have. All I have is passwordless ssh login, and the SSHManager for that is already provided by Distributed.

I just tried calling Distributed directly with:

machines = [("machine01", :auto), ("machine02", :auto)] 
addprocs(machines; topology=:master_slave)

I still get the same tset error.