Can't load stdlib package on remote machine

Hello,

I am trying to create a ClusterManager for Julia using blaunch with LSF.

To test running Julia on another machine, I am using:

blaunch remote.host.name julia test.jl

where remote.host.name is the host name of the remote machine and test.jl is:

println("hello")

However, when I run this I get the following output:

fatal: error thrown and no exception handler available.
InitError(mod=:Base, error=ArgumentError(msg="Package Sockets not found in current path:
- Run `Pkg.add("Sockets")` to install the Sockets package.
"))
rec_backtrace at /buildworker/worker/package_linux64/build/src/stackwalk.c:94
record_backtrace at /buildworker/worker/package_linux64/build/src/task.c:246
jl_throw at /buildworker/worker/package_linux64/build/src/task.c:577
require at ./loading.jl:817
init_stdio at ./stream.jl:237
jfptr_init_stdio_4446.clone_1 at /julia/lib/julia/sys.so (unknown line)
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2182
reinit_stdio at ./libuv.jl:120
__init__ at ./sysimg.jl:470
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2182
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1536 [inlined]
jl_module_run_initializer at /buildworker/worker/package_linux64/build/src/toplevel.c:90
_julia_init at /buildworker/worker/package_linux64/build/src/init.c:811
julia_init__threading at /buildworker/worker/package_linux64/build/src/task.c:302
main at /buildworker/worker/package_linux64/build/ui/repl.c:227
__libc_start_main at /lib64/libc.so.6 (unknown line)
_start at /julia/bin/julia (unknown line)

Are there any requirements for how Julia is launched? IBM says blaunch is a drop-in replacement for ssh, so I’m not sure why it doesn’t work.

I have tried running the command with ssh:

ssh remote.host.name julia test.jl

which works as expected.

That is weird.

What does blaunch julia -e "using Pkg; Pkg.status()" yield? It looks like it is picking up a different environment between blaunch and ssh.

I get the same error; it looks like it’s failing before any of the code I supply is run.

But I’ve tracked down the difference between executing Julia via ssh and blaunch.

ssh uses UV_NAMED_PIPE in __init__() while blaunch uses UV_TCP.

With UV_TCP, the code tries to load the Sockets module, but the LOAD_PATH hasn’t been set yet.
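
For anyone who wants to check what their launcher hands to the remote process without starting Julia at all, here is a rough shell sketch. The function name fd_kind is made up for illustration, and it assumes a Linux-style /dev/fd directory on the remote machine. It only distinguishes pipes from sockets in general; it cannot tell a TCP socket from a Unix-domain socket, which as far as I understand libuv would still report as UV_NAMED_PIPE.

```shell
# fd_kind is a hypothetical helper, not part of LSF, ssh, or Julia.
# It prints a coarse description of what the given file descriptor refers to.
# Assumes /dev/fd/N is available (true on typical Linux systems).
fd_kind() {
    fd="$1"
    if [ -t "$fd" ]; then
        echo "tty"        # interactive terminal
    elif [ -p "/dev/fd/$fd" ]; then
        echo "pipe"       # FIFO / anonymous pipe
    elif [ -S "/dev/fd/$fd" ]; then
        echo "socket"     # TCP or Unix-domain socket (not distinguished here)
    elif [ -f "/dev/fd/$fd" ]; then
        echo "file"       # regular file
    else
        echo "other"      # e.g. /dev/null or another device
    fi
}
```

Put this in a small script that ends by calling fd_kind 0, fd_kind 1 and fd_kind 2, then run that script once through ssh and once through blaunch. Given what I found above, I would expect "pipe" under ssh and "socket" under blaunch, but I have not verified the output of this particular script on an LSF cluster.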

I think this is a bug, so I’ll post on the issue tracker.

Have you seen https://github.com/JuliaParallel/ClusterManagers.jl/pull/74? I just started trying to get it to work under 0.7…