I am learning how to set up a computer cluster for computation. I have started the SSH service on a server running Windows 10, with the same Julia version (1.9.3, the latest) installed at the same path as on my laptop, and I can successfully SSH into it from my laptop running Windows 11. However, when I call addprocs(["user@remote_IP_address"]), it throws:
ERROR: TaskFailedException
nested task error: Unable to read host:port string from worker. Launch command exited with error?
What did I miss in the above setup that causes this failure? Or, preferably, could you please share your experience of a successful setup?
johnh
October 15, 2023, 8:06am
2
I would ask you to consider using a Linux-based cloud service:
AWS ParallelCluster (Amazon Web Services)
High-performance computing (HPC) on Azure (Azure Architecture Center, Microsoft Learn)
I have been building HPC clusters for over 20 years - go with the flow
Really cool! But it doesn’t actually resolve my problem.
johnh
October 15, 2023, 5:26pm
4
Does it help if you use the SSH server in MobaXterm?
https://mobaxterm.mobatek.net/features.html
sob
August 15, 2025, 9:50pm
5
@lionisxn Did you ever solve this?
I have a Linux host that can do key-based SSH login into a Windows host and start julia.exe.
However, on the Linux machine:
julia> addprocs(["sob@win10-work"],shell=:wincmd)
The syntax of the command is incorrect.
ERROR: TaskFailedException
nested task error: Unable to read host:port string from worker. Launch command exited with error?
Stacktrace:
[1] worker_from_id(pg::Distributed.ProcessGroup, i::Int64)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:1093
[2] worker_from_id
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:1090 [inlined]
[3] remote_do
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/remotecall.jl:557 [inlined]
[4] kill(manager::Distributed.SSHManager, pid::Int64, config::WorkerConfig)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/managers.jl:736
[5] create_worker(manager::Distributed.SSHManager, wconfig::WorkerConfig)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:604
[6] setup_launched_worker(manager::Distributed.SSHManager, wconfig::WorkerConfig, launched_q::Vector{Int64})
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:545
[7] (::Distributed.var"#45#48"{Distributed.SSHManager, Vector{Int64}, WorkerConfig})()
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:501
caused by: Unable to read host:port string from worker. Launch command exited with error?
Stacktrace:
[1] read_worker_host_port(io::Base.PipeEndpoint)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:330
[2] connect(manager::Distributed.SSHManager, pid::Int64, config::WorkerConfig)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/managers.jl:580
[3] create_worker(manager::Distributed.SSHManager, wconfig::WorkerConfig)
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:600
[4] setup_launched_worker(manager::Distributed.SSHManager, wconfig::WorkerConfig, launched_q::Vector{Int64})
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:545
[5] (::Distributed.var"#45#48"{Distributed.SSHManager, Vector{Int64}, WorkerConfig})()
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:501
Stacktrace:
[1] sync_end(c::Channel{Any})
@ Base ./task.jl:466
[2] macro expansion
@ ./task.jl:499 [inlined]
[3] addprocs_locked(manager::Distributed.SSHManager; kwargs::@Kwargs{shell::Symbol})
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:490
[4] addprocs_locked
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:456 [inlined]
[5] addprocs(manager::Distributed.SSHManager; kwargs::@Kwargs{shell::Symbol})
@ Distributed ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:450
[6] addprocs
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/cluster.jl:443 [inlined]
[7] #addprocs#255
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/managers.jl:159 [inlined]
[8] top-level scope
@ REPL[16]:1
julia>
I have tried setting exename explicitly, but that doesn’t change the outcome. I should also note that I’m able to create a worker on a remote Linux host without issue (also with key-based SSH auth).
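One way to confirm the "SSH in and start julia.exe" step from inside the Julia session is to build the same kind of ssh command by hand and read back what the remote side prints. This is only a sketch: the helper name `ssh_julia_check` is made up for illustration, and `exename="julia"` assumes julia.exe is on the remote PATH (otherwise pass the full remote path).

```julia
# Hypothetical helper: build an ssh command that asks the remote host
# to print its Julia version. Nothing is executed until the Cmd is run.
ssh_julia_check(host::AbstractString; exename::AbstractString="julia") =
    `ssh $host $exename --version`

# Usage (needs a reachable host with key-based SSH login):
# print(read(ssh_julia_check("sob@win10-work"), String))
```

If this prints a version string but addprocs() still fails, the problem is probably in the launch command that addprocs() builds rather than in the SSH setup itself.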
sob
August 17, 2025, 1:53am
6
It turns out I only needed to add dir=nothing to the above addprocs() call for it to work. Now I successfully get:
julia> @show remotecall_fetch(Sys.windows_version,26)
remotecall_fetch(Sys.windows_version, 26) = v"10.0.19045"
v"10.0.19045"
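For anyone landing here later, a minimal sketch of the call that worked, wrapped in a small function. The names `windows_worker_kwargs` and `add_windows_worker` are made up for illustration, and `exename="julia"` is an assumption (julia.exe on the remote PATH); whether exename needs to be set at all depends on the remote setup. The likely reason dir=nothing matters is that dir defaults to the local working directory, which is a Linux path here and does not exist on the Windows side.

```julia
using Distributed

# Keyword arguments for launching a worker on a Windows host over SSH.
# shell=:wincmd : the remote login shell is cmd.exe
# dir=nothing   : do not change into the local working directory on the remote host
# exename       : assumption, julia.exe is reachable as "julia" on the remote PATH
windows_worker_kwargs(; exename::AbstractString="julia") =
    (shell=:wincmd, dir=nothing, exename=exename)

# Hypothetical helper: `host` is a "user@hostname" string.
# Returns the ids of the newly added workers.
function add_windows_worker(host::AbstractString; exename::AbstractString="julia")
    return addprocs([host]; windows_worker_kwargs(; exename=exename)...)
end

# Usage (needs a reachable host with key-based SSH login):
# pids = add_windows_worker("sob@win10-work")
# remotecall_fetch(Sys.windows_version, first(pids))
```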