Parallel computing on remote workers



I managed to add a remote worker via addprocs():

addprocs(["tgal@ecapmac"], tunnel=true, dir="/Users/tgal/tmp")

…but Julia hangs and prints the error below when I launch the REPL with a machine file containing the same hostname. It seems to try to cd into the path of my current local working directory on the remote machine, where that path does not exist, and then it hangs forever.

Is there a way to set the working directory separately for each machine, or am I doing something wrong?

  tamasgal@greybox:~/tmp/juliaparallel
  00:40:58 > julia --machinefile machines
sh: line 0: cd: /Users/tamasgal/tmp/juliaparallel: No such file or directory

Nothing happens after that; I can only press Ctrl+C, which produces this traceback:

^CERROR: InterruptException:
 in unsafe_convert(::Type{Ptr{Void}}, ::Ptr{Void}) at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 in exec(::Ptr{Void}, ::String, ::Int64, ::UInt32, ::Ptr{Void}) at ./pcre.jl:130
 in match(::Regex, ::String, ::Int64, ::UInt32) at ./regex.jl:161
 in parse_connection_info(::String) at ./multi.jl:1597
 in read_worker_host_port(::Pipe) at ./multi.jl:1589
 in connect(::Base.SSHManager, ::Int64, ::WorkerConfig) at ./managers.jl:385
 in create_worker(::Base.SSHManager, ::WorkerConfig) at ./multi.jl:1786
 in setup_launched_worker(::Base.SSHManager, ::WorkerConfig, ::Array{Int64,1}) at ./multi.jl:1733
 in (::Base.##649#653{Base.SSHManager,Array{Int64,1}})() at ./task.jl:360
 in sync_end() at ./task.jl:311
 in macro expansion at ./task.jl:327 [inlined]
 in #addprocs_locked#645(::Array{Any,1}, ::Function, ::Base.SSHManager) at ./multi.jl:1688
 in #addprocs_locked#645(::Array{Any,1}, ::Function, ::Base.SSHManager) at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 in (::Base.#kw##addprocs_locked)(::Array{Any,1}, ::Base.#addprocs_locked, ::Base.SSHManager) at ./<missing>:0
 in #addprocs#644(::Array{Any,1}, ::Function, ::Base.SSHManager) at ./multi.jl:1658
 in #addprocs#644(::Array{Any,1}, ::Function, ::Base.SSHManager) at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 in (::Base.#kw##addprocs)(::Array{Any,1}, ::Base.#addprocs, ::Base.SSHManager) at ./<missing>:0
 in #addprocs#744(::Bool, ::Cmd, ::Int64, ::Array{Any,1}, ::Function, ::Array{Any,1}) at ./managers.jl:112
 in #addprocs#744(::Bool, ::Cmd, ::Int64, ::Array{Any,1}, ::Function, ::Array{Any,1}) at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia/sys.dylib:?
 in process_options(::Base.JLOptions) at ./client.jl:228
 in _start() at ./client.jl:318
 in _start() at /Applications/Julia-0.5.app/Contents/Resources/julia/lib/julia/sys.dylib:?
UndefRefError()
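
For reference, here is a minimal sketch of a possible workaround, not a confirmed fix: since the addprocs() call with an explicit dir works, one could skip the machine file and call addprocs() once per host, each with its own dir. As far as I can tell the machine file has no per-host directory field, but that is an assumption on my part. The names WORKER_DIRS and add_workers are hypothetical, and the `using Distributed` line only applies to Julia 0.7 and later (on 0.5, addprocs lives in Base).

```julia
using Distributed  # Julia >= 0.7 only; on 0.5/0.6 addprocs is in Base, so drop this line

# Hypothetical mapping from SSH host to the working directory to use on that host.
const WORKER_DIRS = Dict(
    "tgal@ecapmac" => "/Users/tgal/tmp",
)

# Start one worker per host, each in its own directory, instead of relying on
# the machine file (which launches every worker in the local pwd()).
# Returns the ids of all workers that were added.
function add_workers(dirs::Dict{String,String}; tunnel::Bool=true)
    ids = Int[]
    for (host, dir) in dirs
        append!(ids, addprocs([host]; tunnel=tunnel, dir=dir))
    end
    return ids
end

# add_workers(WORKER_DIRS)   # needs passwordless SSH to every listed host
```

The call at the bottom is commented out because it only makes sense where the listed hosts are reachable over SSH; putting it in a startup script would stand in for `julia --machinefile machines`.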