In my SSH config file, I have:

    Host a
        User me
        HostName a.example.com

    Host b
        User me
        HostName b.example.com
        ProxyJump a
I want to add b as a worker process from my local machine using addprocs.
I’m able to add a as a worker just fine with addprocs(["a"]), but addprocs(["b"]) hangs for a while then times out:
ERROR: IOError: connect: connection timed out (ETIMEDOUT)
try_yieldto(::typeof(Base.ensure_rescheduled), ::Base.RefValue{Task}) at ./event.jl:196
wait() at ./event.jl:255
wait(::Condition) at ./event.jl:46
stream_wait(::Sockets.TCPSocket, ::Condition) at ./stream.jl:47
wait_connected(::Sockets.TCPSocket) at ./stream.jl:330
connect at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Sockets/src/Sockets.jl:456 [inlined]
connect_to_worker(::SubString{String}, ::UInt16) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/managers.jl:499
connect(::Distributed.SSHManager, ::Int64, ::WorkerConfig) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/managers.jl:437
create_worker(::Distributed.SSHManager, ::WorkerConfig) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/cluster.jl:501
setup_launched_worker(::Distributed.SSHManager, ::WorkerConfig, ::Array{Int64,1}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/cluster.jl:447
(::getfield(Distributed, Symbol("##47#50")){Distributed.SSHManager,WorkerConfig})() at ./task.jl:259
Stacktrace:
[1] sync_end(::Array{Any,1}) at ./task.jl:226
[2] macro expansion at ./task.jl:245 [inlined]
[3] #addprocs_locked#44(::Base.Iterators.Pairs{Symbol,Any,NTuple{5,Symbol},NamedTuple{(:tunnel, :sshflags, :max_parallel, :dir, :exename),Tuple{Bool,Cmd,Int64,String,String}}}, ::Function, ::Distributed.SSHManager) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/cluster.jl:401
[4] #addprocs_locked at ./none:0 [inlined]
[5] #addprocs#43(::Base.Iterators.Pairs{Symbol,Any,NTuple{5,Symbol},NamedTuple{(:tunnel, :sshflags, :max_parallel, :dir, :exename),Tuple{Bool,Cmd,Int64,String,String}}}, ::Function, ::Distributed.SSHManager) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/cluster.jl:365
[6] #addprocs at ./none:0 [inlined]
[7] #addprocs#249 at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.1/Distributed/src/managers.jl:118 [inlined]
[8] (::getfield(Distributed, Symbol("#kw##addprocs")))(::NamedTuple{(:dir, :exename),Tuple{String,String}}, ::typeof(addprocs), ::Array{String,1}) at ./none:0
[9] top-level scope at none:0
Some things to note:
- My RSA key is password protected. I’ve run ssh-add on my local machine though.
- The path to the Julia executable differs between my local machine, a, and b. (I’m using the exename keyword argument in addprocs and setting it to the path on a.)
- I’ve tried with and without tunnel=true.
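
For concreteness, the calls described in these notes look roughly like the sketch below. The exename path is a placeholder rather than the real path, and the helper names worker_kwargs and try_add are made up for illustration; only addprocs and its exename and tunnel keywords come from Distributed.

```julia
using Distributed

# Placeholder (assumption): location of the julia binary on host `a`.
const EXENAME_ON_A = "/path/on/a/to/julia"

# Hypothetical helper: collect the keyword arguments mentioned in the notes.
# `tunnel` switches between the two variants that were tried.
worker_kwargs(; tunnel::Bool=false) = (exename = EXENAME_ON_A, tunnel = tunnel)

# Hypothetical helper: try to add `host` (an alias from the SSH config) as a worker.
try_add(host::AbstractString; tunnel::Bool=false) =
    addprocs([host]; worker_kwargs(tunnel = tunnel)...)

# The calls below need the real hosts, so they are left commented out:
# try_add("a")                  # the host that can be added
# try_add("b")                  # the host behind ProxyJump
# try_add("b"; tunnel = true)   # the tunnel=true variant
```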
Is there something fundamental I’m missing here? My knowledge of SSH in general, as well as Distributed and SSHManager, is lacking.