I’m attempting to use Julia on a cluster (using PBS) that blocks ssh connections to anything except the head node. Based on the documentation, it seems that Julia requires passwordless ssh to start workers on cluster nodes:
The base Julia installation has in-built support for two types of clusters:
- A local cluster specified with the -p option as shown above.
- A cluster spanning machines using the --machinefile option. This uses a passwordless ssh login to start Julia worker processes (from the same path as the current host) on the specified machines.
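For concreteness, here is a rough shell sketch of how those two modes would be invoked from inside a PBS job. It assumes the scheduler provides a node list with one hostname per allocated core (as the file named by $PBS_NODEFILE usually does) and that the machine file accepts the count*host form. The names make_machinefile, machines.txt, and script.jl are hypothetical, and the --machinefile route still depends on the passwordless ssh that is blocked here:

```shell
#!/bin/sh
# Hypothetical helper: collapse a PBS-style node list (one hostname per
# allocated core) into machine-file lines of the form "count*host".
make_machinefile() {
    # $1: path to a node list, e.g. the file named by $PBS_NODEFILE
    sort "$1" | uniq -c | awk '{ print $1 "*" $2 }'
}

# Intended use inside a PBS job (shown as comments, not executed here):
#   make_machinefile "$PBS_NODEFILE" > machines.txt
#   julia --machinefile machines.txt script.jl   # needs passwordless ssh
#   julia -p 4 script.jl                         # local workers only, no ssh
```

Only the -p form works without ssh, and it is limited to the cores of a single node.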
I’ve tried the solutions presented on this thread, but one of two things happens: (1) ClusterManagers hangs when calling addprocs_pbs(), or (2) I get a permissions error when the ssh connection is attempted:
Permission denied, please try again.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
ERROR: Unable to read host:port string from worker. Launch command exited with error?
read_worker_host_port(::Pipe) at ./distributed/cluster.jl:236
connect(::Base.Distributed.SSHManager, ::Int64, ::WorkerConfig) at ./distributed/managers.jl:391
create_worker(::Base.Distributed.SSHManager, ::WorkerConfig) at ./distributed/cluster.jl:443
setup_launched_worker(::Base.Distributed.SSHManager, ::WorkerConfig, ::Array{Int64,1}) at ./distributed/cluster.jl:389
(::Base.Distributed.##33#36{Base.Distributed.SSHManager,WorkerConfig,Array{Int64,1}})() at ./task.jl:335
My sysadmin seems unwilling to allow ssh connections to worker nodes. Is there another way to use Julia across multiple nodes of the cluster that does not require ssh access to them?