Trouble with addprocs

I’m having trouble with addprocs. I have set up key-based SSH login, but the key still asks for its passphrase. I mention this in case it could be the issue, though I doubt it.

I have also looked at other posts here, but they seem to be running into different problems, or are unsolved.

I am using the following call

julia> addprocs(["admin@some.server.com"],exename="julia",dir="/home/admin")
Enter passphrase for key '/c/Users/Jeremy/.ssh/id_rsa':
ERROR: connect: connection timed out (ETIMEDOUT)
try_yieldto(::Base.##296#297{Task}, ::Task) at .\event.jl:189
...

The connection is clearly working, since if I change the exename, I get an error from the server telling me the executable does not exist:

julia> addprocs(["admin@some.server.com"],exename="notjulia",dir="/home/admin")
Enter passphrase for key '/c/Users/Jeremy/.ssh/id_rsa':
bash: notjulia: command not found
ERROR: Unable to read host:port string from worker. Launch command exited with error?
read_worker_host_port(::Pipe) at .\distributed\cluster.jl:236
...

Finally, trying to use SSH tunnels:

julia> addprocs(["admin@some.server.com"],exename="julia",dir="/home/admin",tunnel=true)
Enter passphrase for key '/c/Users/Jeremy/.ssh/id_rsa':
ERROR: unable to create SSH tunnel after 100 tries. No free port?
ssh_tunnel(::SubString{String}, ::SubString{String}, ::SubString{String}, ::UInt16, ::Cmd) at .\distributed\managers.jl:278

I am guessing that these errors are due to the server’s firewall. I therefore have two questions: how do we know which ports to open for (1) the regular (non-SSH) workers, and (2) the SSH tunnel workers?

Thanks,
Jeremy

From what you write here, the problem really seems to be the passphrase on the SSH key. This won’t work: passwordless login is a requirement. Just erase that key (both the public and private parts), regenerate it without a passphrase, and start from there.

Despite the ssh connection being clearly established properly?

Well, the connection only works if you supply a password. Julia won’t supply your password, so you won’t have a connection. That second error message just shows that the master couldn’t read back from the worker; it does not indicate that you established a connection.

I actually can supply the password when that prompt appears.

I also get an error message if I don’t give the right dir. I am therefore fairly certain the SSH connection works fine.

OK, I see. Can you SSH from the worker back to the master? In general you need passwordless SSH that works both ways.

From what I understand, in non-tunnel mode, the master-worker connection does not use SSH, so that should not be the issue.

In tunnel mode, I am not sure how the SSH connection is established, but the docs don’t mention anything about needing a passwordless login from the worker. If that is needed then it obviously wouldn’t work in my current configuration.

It turns out that the password was the issue for the tunnel workers. I used SSH agent and that fixed it.

It didn’t fix the non-tunnel workers, so that’s still probably a firewall issue, but it’s OK since I’d rather not have unsecured communication.
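
For anyone landing here later, this is roughly what the tunnel setup looks like once the key is loaded into ssh-agent (for example with ssh-add). It is only a sketch: the helper names agent_available and add_tunneled_workers are made up for illustration, and checking SSH_AUTH_SOCK is just a heuristic for "an agent is running", not something addprocs itself requires.

```julia
using Distributed  # Julia 0.7 and later; on 0.6 addprocs lives in Base and this line is not needed

# Hypothetical helper: true when an ssh-agent socket is advertised in the environment.
# This does not verify that the key itself has been added to the agent.
agent_available(env=ENV) = !isempty(get(env, "SSH_AUTH_SOCK", ""))

# Hypothetical wrapper: refuse to launch unless an agent appears to be available,
# then start the workers over SSH tunnels.
function add_tunneled_workers(machines; kwargs...)
    agent_available() || error("no ssh-agent found; start one and load the key with ssh-add first")
    return addprocs(machines; tunnel=true, kwargs...)
end

# Usage, with the same host and options as in the first post:
# add_tunneled_workers(["admin@some.server.com"]; exename="julia", dir="/home/admin")
```

With the agent holding the key, no passphrase prompt appears, which seems to be what the tunnel setup needed in my case.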

The problem with the non-tunneled workers might be related to https://github.com/JuliaLang/julia/pull/25126.

The workers need to open a port, but if you have a firewall on the server, how do you know which ports to allow?

I guess that pull request gives a range, but I’m not sure how safe it is to leave ports open like this on a public server…
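
One option that might avoid opening a whole range: the addprocs docs describe the machine specification as "[user@]host[:port] [bind_addr[:port]]", where the second part is the address and port the worker listens on. A rough sketch of pinning a single worker to one known port is below. The helper machine_spec is made up for illustration, and the address 203.0.113.10 and port 9500 are placeholders, not values from this thread. Each worker needs its own port, so this is only meant for one worker per spec.

```julia
using Distributed  # Julia 0.7 and later; on 0.6 addprocs lives in Base

# Hypothetical helper: build a machine specification string of the form
# "[user@]host bind_addr:port" after checking that the port is valid.
function machine_spec(user_host::AbstractString, bind_addr::AbstractString, port::Integer)
    1 <= port <= 65535 || throw(ArgumentError("port out of range: $port"))
    return "$user_host $bind_addr:$port"
end

# Placeholder address and port; substitute the server's real address and
# whichever single port the firewall allows.
spec = machine_spec("admin@some.server.com", "203.0.113.10", 9500)

# The worker would then be asked to listen on that address and port:
# addprocs([spec]; exename="julia", dir="/home/admin")
```

That way the firewall rule can name one port instead of a range, though the traffic on it is still unencrypted unless tunnel=true is used.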