The path is now full of good stuff (not optimal, but I thought it worth a try), and find_library still fails to confirm that libmpi is in the path.
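For context, a minimal sketch of how find_library can be probed from the REPL. `Libdl.find_library` returns an empty string when it cannot load any of the candidate names, and it decides this by actually trying to load the library, so an empty result can mean "a dependency failed to load" rather than "the file is not on the path". The helper name `locate_library` is made up for illustration.

```julia
using Libdl

# Hypothetical helper: returns the first loadable name, or `nothing`.
# find_library tries to dlopen each candidate, so a library whose own
# dependencies are missing is reported the same way as a missing file.
function locate_library(names::Vector{String}, dirs::Vector{String}=String[])
    found = find_library(names, dirs)
    return isempty(found) ? nothing : found
end

result = locate_library(["libmpi"])
println(result === nothing ? "libmpi could not be loaded" : "loadable as: $result")
```

If this prints that libmpi could not be loaded even though the file exists, calling dlopen on it directly shows the underlying loader error.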
Sorry I cannot contribute here - and I really should be able to.
Just commenting that it looks like you are using EESSI, which uses CernVM-FS and EasyBuild. Fantastic!
Certainly. Copied below. I guess the good news is that it seems to have libmpi in the path in that I get exactly the same output as when I type out the exact location of libmpi.so
julia> dlopen("libmpi")
ERROR: could not load library "libmpi"
libevent-2.1.so.6: cannot open shared object file: No such file or directory
Stacktrace:
[1] dlopen(s::String, flags::UInt32; throw_error::Bool)
@ Base.Libc.Libdl ./libdl.jl:114
[2] dlopen (repeats 2 times)
@ ./libdl.jl:114 [inlined]
[3] top-level scope
@ REPL[2]:1
When I try dlopen("libevent-2.1") I get a message saying it cannot open the shared object file. If I instead type in the whole path, I get something a bit more interesting.
julia> dlopen("/cvmfs/soft.computecanada.ca/gentoo/2020/usr/lib64/libevent-2.1.so.6")
ERROR: could not load library "/cvmfs/soft.computecanada.ca/gentoo/2020/usr/lib64/libevent-2.1.so.6"
libcrypto.so.1.1: cannot open shared object file: No such file or directory
When I try opening the libcrypto library I then get a problem with libc.
julia> dlopen("/cvmfs/soft.computecanada.ca/gentoo/2020/usr/lib64/libcrypto.so.1.1")
ERROR: could not load library "/cvmfs/soft.computecanada.ca/gentoo/2020/usr/lib64/libcrypto.so.1.1"
/lib64/libc.so.6: version `GLIBC_2.25' not found (required by /cvmfs/soft.computecanada.ca/gentoo/2020/usr/lib64/libcrypto.so.1.1)
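The GLIBC_2.25 message suggests that the C library the Julia process is running against is older than the one libcrypto was built for. A rough way to check this from inside Julia, assuming a Linux system with glibc (the function `gnu_get_libc_version` is glibc-specific; the helper name `glibc_version` is made up here):

```julia
# Hypothetical helper: version of the glibc this Julia process is using,
# or `nothing` if it cannot be determined (non-Linux, non-glibc, etc.).
function glibc_version()
    Sys.islinux() || return nothing
    try
        s = unsafe_string(ccall(:gnu_get_libc_version, Cstring, ()))
        return VersionNumber(s)
    catch
        return nothing
    end
end

needed = v"2.25"   # taken from the error message above
have = glibc_version()
if have === nothing
    println("could not determine the glibc version")
elseif have < needed
    println("glibc $have is older than the required $needed")
else
    println("glibc $have satisfies $needed")
end
```

If the reported version is below 2.25, the libraries under /cvmfs were built against a newer glibc than the one this Julia binary is using.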
Finally, when I try opening libc, I get a segmentation fault. I wonder if this might be the source of the problem?
julia> dlopen("/lib64/libc.so.6")
signal (11): Segmentation fault
in expression starting at REPL[8]:1
_dl_relocate_object at /lib64/ld-linux-x86-64.so.2 (unknown line)
dl_open_worker at /lib64/ld-linux-x86-64.so.2 (unknown line)
_dl_catch_error at /lib64/ld-linux-x86-64.so.2 (unknown line)
_dl_open at /lib64/ld-linux-x86-64.so.2 (unknown line)
dlopen_doit at /lib64/libdl.so.2 (unknown line)
_dl_catch_error at /lib64/ld-linux-x86-64.so.2 (unknown line)
_dlerror_run at /lib64/libdl.so.2 (unknown line)
dlopen at /lib64/libdl.so.2 (unknown line)
jl_load_dynamic_library at /buildworker/worker/package_linux64/build/src/dlload.c:257
#dlopen#3 at ./libdl.jl:114
dlopen at ./libdl.jl:114 [inlined]
dlopen at ./libdl.jl:114
jfptr_dlopen_52107.clone_1 at /home/fpoulin/software/julia-1.6.1/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
do_call at /buildworker/worker/package_linux64/build/src/interpreter.c:115
eval_value at /buildworker/worker/package_linux64/build/src/interpreter.c:204
eval_stmt_value at /buildworker/worker/package_linux64/build/src/interpreter.c:155 [inlined]
eval_body at /buildworker/worker/package_linux64/build/src/interpreter.c:562
jl_interpret_toplevel_thunk at /buildworker/worker/package_linux64/build/src/interpreter.c:670
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:877
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:825
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/toplevel.c:929
eval at ./boot.jl:360 [inlined]
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:139
repl_backend_loop at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:200
start_repl_backend at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:185
#run_repl#42 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:317
run_repl at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:305
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
#874 at ./client.jl:387
jfptr_YY.874_41532.clone_1 at /home/fpoulin/software/julia-1.6.1/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
jl_f__call_latest at /buildworker/worker/package_linux64/build/src/builtins.c:714
#invokelatest#2 at ./essentials.jl:708 [inlined]
invokelatest at ./essentials.jl:706 [inlined]
run_main_repl at ./client.jl:372
exec_options at ./client.jl:302
_start at ./client.jl:485
jfptr__start_34289.clone_1 at /home/fpoulin/software/julia-1.6.1/lib/julia/sys.so (unknown line)
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1703 [inlined]
true_main at /buildworker/worker/package_linux64/build/src/jlapi.c:560
repl_entrypoint at /buildworker/worker/package_linux64/build/src/jlapi.c:702
main at julia (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x4007d8)
Allocations: 2651 (Pool: 2640; Big: 11); GC: 0
Segmentation fault
Thanks for sharing this. I did not know where EasyBuild comes from, but it is certainly used heavily on many big servers as part of Compute Canada.
Actually, there is a built-in version of Julia 1.6 on the server, which I was using before. Unfortunately, when I tried to use Plots.jl it failed to produce an mp4 file. This doesn’t happen with the binary version of Julia 1.6, so I presume that’s a bug in the EasyBuild version. Do you think this is something I should mention somewhere and, if yes, where exactly?
EasyBuild exists to make the process of maintaining software on HPC systems easy!
As you have seen, there are several varieties of MPI, compilers, maths libraries, etc. on any HPC system. EasyBuild has the concept of ‘toolchains’, so that applications can be built and maintained with given combinations of the basic tools, for example an Intel compiler version versus a GNU compiler version.
Also on HPC systems you will have software packages which are optimised for the particular CPU architecture you run on, not just the generic builds.
Unfortunately, I have not had a successful build on the server. I’m asking for some technical support, and if others manage to do it, I will certainly share what I learn. Sorry that I could not be of more help.
Messages like
fileX.so: cannot open shared object file: No such file or directory
are misleading, because often it isn’t fileX.so that can’t be found, but one of its dependencies. Operating systems are bad at this: the message doesn’t always tell you which file actually can’t be found. On systems like Linux you can use strace to find out which file is being searched for but can’t be found. There is nothing Julia can do about it; operating systems are just unhelpful.
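Besides strace, the standard Linux tool ldd lists a library's dependencies and marks unresolved ones with "=> not found". A small sketch that picks those entries out of ldd's output; the helper name `missing_deps` and the sample text are made up for illustration and are not output from the system discussed above:

```julia
# Hypothetical helper: collect the names that ldd reports as "not found".
function missing_deps(ldd_output::AbstractString)
    missing_libs = String[]
    for line in split(ldd_output, '\n')
        m = match(r"^\s*(\S+)\s+=>\s+not found", line)
        m === nothing || push!(missing_libs, m.captures[1])
    end
    return missing_libs
end

# Illustrative sample in ldd's output format (not real output).
sample = "\tlibevent-2.1.so.6 => not found\n" *
         "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000000000)\n"

println(missing_deps(sample))

# On a real system one might run, with a path of your own:
#   missing_deps(read(`ldd /path/to/libmpi.so`, String))
```

This only covers files that cannot be located at all. Version mismatches such as the GLIBC_2.25 error are reported by ldd on separate lines and are not picked up by this pattern.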