segfault in Libdl.dlpath("libjulia")


#1
julia> Libdl.dlpath("libjulia")
Two passes with the same argument (-LowerSIMDLoop) attempted to be registered!

signal (11): Segmentation fault
while loading no file, in expression starting on line 0
unknown function (ip: 0x7f3482fc1e4f)
Allocations: 1007153 (Pool: 1006016; Big: 1137); GC: 0
Segmentation fault (core dumped)

I am on Ubuntu 16.04 with Julia 0.6.1.
Before I open an issue, can anyone reproduce this?
Or is this caused by my specific environment?


#2

libjulia is already loaded in the process (by necessity!) so ccall will work for any export without specifying the library/handle. To pick a random example:

stream.jl:        ccall(:jl_init_pipe, Cint, (Ptr{Void},Int32,Int32,Int32), read_end, 0, 1, readable_julia_only))
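
As a minimal sketch of the point above (not taken from the thread): the two calls below name no library or handle, so each symbol is resolved in the already-running process. It assumes jl_ver_major and jl_ver_string are exported by the libjulia in use.

```julia
# Call libjulia exports by bare symbol: no library name or handle is given,
# so the lookup happens in the already-loaded process image.
major = ccall(:jl_ver_major, Cint, ())

# jl_ver_string returns a C string owned by the runtime; copy it into a String.
version_text = unsafe_string(ccall(:jl_ver_string, Cstring, ()))

println("libjulia reports version ", version_text, " (major ", major, ")")
```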

#3

About the segfault, is this a local build? I can’t reproduce it on Ubuntu 17.04 with the 0.6.1 generic 64-bit binary. dlopen("libjulia") should be a no-op for in-process libjulia (returns the existing handle).

There might be an issue if you try to dlopen a different libjulia than the one in the current process, for various reasons. Specifically, the message you posted appears because the dynamic loader runs static initializers, which can conflict with existing singletons (mostly LLVM). You could try adding RTLD_LOCAL, but I'm not sure that is sufficient.


#4

The problem actually comes from JuliaCall, which embeds Julia in R. If we want to embed Julia in R, we still need to find out where libjulia is, don't we? Is there another way to locate the Julia library other than using Libdl.dlpath("libjulia")?
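
One possible alternative, sketched here under assumptions and not proposed by anyone in the thread: Libdl.dllist() lists the shared libraries already mapped into the process, so the path of the running libjulia can be read from that list without calling dlopen again. The helper name loaded_libs is hypothetical. Base.Libdl is spelled out because Libdl lives in Base on 0.6.

```julia
# Hypothetical helper: paths of already-loaded shared libraries whose file
# name starts with `prefix`. dllist() only inspects the current process and
# does not open anything, so no static initializers are run again.
function loaded_libs(prefix)
    return filter(p -> startswith(basename(p), prefix), Base.Libdl.dllist())
end

# Print every loaded library whose file name starts with "libjulia".
for path in loaded_libs("libjulia")
    println(path)
end
```

Whether this avoids the crash on the affected machines is untested; it only sidesteps the second dlopen that appears in the backtrace.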


#5

Makes sense.

@vhd is there anything Julia-related in the environment variables? If you are using Julia from a package manager, could you please try the generic binary as a differential step?


#6

I am using the official 64-bit build from the julialang website.

I will set up a clean virtual machine installation and try it again.


#7

I think not: printenv | grep julia -i shows nothing.


#8

Could you try gdb julia and then post the output of bt (backtrace) after the segfault?


#9
(gdb) run
Starting program: /usr/local/bin/julia 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff4082700 (LWP 32343)]
[New Thread 0x7fffe8b7d700 (LWP 32344)]
[New Thread 0x7fffe637c700 (LWP 32345)]
[New Thread 0x7fffe5b7b700 (LWP 32346)]
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.1 (2017-10-24 22:15 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-pc-linux-gnu

julia> Libdl.dlopen("libjulia")
Two passes with the same argument (-LowerSIMDLoop) attempted to be registered!

Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
0x00007ffff532de50 in llvm::PassNameParser::passEnumerate(llvm::PassInfo const*) () from /opt/julia/bin/../lib/julia/libLLVM-3.9.so

#10
#0  0x00007ffff532de50 in llvm::PassNameParser::passEnumerate(llvm::PassInfo const*) () from /opt/julia/bin/../lib/julia/libLLVM-3.9.so
#1  0x00007fffd8d10600 in ?? () from /opt/julia/bin/../lib/libjulia.so
#2  0x00000000006165e0 in ?? ()
#3  0x00007fffd8d10600 in ?? () from /opt/julia/bin/../lib/libjulia.so
#4  0x00000000006166a8 in ?? ()
#5  0x00000000006166b0 in ?? ()
#6  0x00007ffff5338832 in llvm::PassRegistry::registerPass(llvm::PassInfo const&, bool) () from /opt/julia/bin/../lib/julia/libLLVM-3.9.so
#7  0x00007fffd8649dda in llvm::RegisterPass<llvm::LowerSIMDLoop>::RegisterPass (is_analysis=false, CFGOnly=false, Name=0x7fffd877a2c8 "LowerSIMDLoop Pass", PassArg=0x7fffd877a2db "LowerSIMDLoop", 
    this=0x7fffd8d10600 <llvm::X>) at /buildworker/worker/package_linux64/build/usr/include/llvm/PassSupport.h:109
#8  __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /buildworker/worker/package_linux64/build/src/llvm-simdloop.cpp:203
#9  _GLOBAL__sub_I_llvm_simdloop.cpp(void) () at /buildworker/worker/package_linux64/build/src/llvm-simdloop.cpp:210
#10 0x00007ffff7de76ba in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdd28, env=env@entry=0x1953c50) at dl-init.c:72
#11 0x00007ffff7de77cb in call_init (env=0x1953c50, argv=0x7fffffffdd28, argc=1, l=<optimized out>) at dl-init.c:30
#12 _dl_init (main_map=main_map@entry=0x23607c0, argc=1, argv=0x7fffffffdd28, env=0x1953c50) at dl-init.c:120
#13 0x00007ffff7dec8e2 in dl_open_worker (a=a@entry=0x7fffffffc670) at dl-open.c:575
#14 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffc660, errstring=errstring@entry=0x7fffffffc668, mallocedp=mallocedp@entry=0x7fffffffc65f, 
    operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffc670) at dl-error.c:187
#15 0x00007ffff7debda9 in _dl_open (file=0x7fffffffc9a0 "libjulia.so", mode=-2147483639, caller_dlopen=0x7ffff772993a <jl_load_dynamic_library_+602>, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0x1953c50)
    at dl-open.c:660
#16 0x00007ffff74d0f09 in dlopen_doit (a=a@entry=0x7fffffffc8a0) at dlopen.c:66
#17 0x00007ffff7de7564 in _dl_catch_error (objname=0x62a0b0, errstring=0x62a0b8, mallocedp=0x62a0a8, operate=0x7ffff74d0eb0 <dlopen_doit>, args=0x7fffffffc8a0) at dl-error.c:187
#18 0x00007ffff74d1571 in _dlerror_run (operate=operate@entry=0x7ffff74d0eb0 <dlopen_doit>, args=args@entry=0x7fffffffc8a0) at dlerror.c:163
#19 0x00007ffff74d0fa1 in __dlopen (file=file@entry=0x7fffffffc9a0 "libjulia.so", mode=<optimized out>) at dlopen.c:87
#20 0x00007ffff77296d7 in jl_dlopen (filename=filename@entry=0x7fffffffc9a0 "libjulia.so", flags=flags@entry=68) at /buildworker/worker/package_linux64/build/src/dlload.c:88
#21 0x00007ffff772993a in jl_load_dynamic_library_ (modname=0x7fffd8d8fd38 "libjulia", flags=68, throw_err=1) at /buildworker/worker/package_linux64/build/src/dlload.c:189
#22 0x00007ffff19b809d in julia_dlopen_23832 () at libdl.jl:97
#23 0x00007ffff1a78b7e in julia_dlopen_29612 () at libdl.jl:97
#24 0x00007ffff1a78b9e in jlcall_dlopen_29611 () from /opt/julia/lib/julia/sys.so
#25 0x00007ffff7713eea in jl_call_fptr_internal (fptr=<optimized out>, fptr=<optimized out>, nargs=<optimized out>, args=<optimized out>, meth=<optimized out>)
    at /buildworker/worker/package_linux64/build/src/julia_internal.h:339
#26 jl_call_method_internal (nargs=2, args=0x7fffffffcd50, meth=<optimized out>) at /buildworker/worker/package_linux64/build/src/julia_internal.h:358
#27 jl_apply_generic (args=args@entry=0x7fffffffcd50, nargs=nargs@entry=2) at /buildworker/worker/package_linux64/build/src/gf.c:1926
#28 0x00007ffff7728270 in do_call (args=0x7fffd8d99490, nargs=nargs@entry=2, s=s@entry=0x0) at /buildworker/worker/package_linux64/build/src/interpreter.c:75
#29 0x00007ffff77273af in eval (e=e@entry=0x7fffd8d8fdb0, s=s@entry=0x0) at /buildworker/worker/package_linux64/build/src/interpreter.c:242
#30 0x00007ffff77281b4 in jl_interpret_toplevel_expr (e=0x7fffd8d8fdb0) at /buildworker/worker/package_linux64/build/src/interpreter.c:34
#31 0x00007ffff773e89f in jl_toplevel_eval_flex (e=e@entry=0x7fffd8d8fcf0, expanded=expanded@entry=0, fast=1) at /buildworker/worker/package_linux64/build/src/toplevel.c:577
#32 0x00007ffff773e6f4 in jl_toplevel_eval (v=v@entry=0x7fffd8d8fcf0) at /buildworker/worker/package_linux64/build/src/toplevel.c:600
#33 0x00007ffff7721f88 in jl_toplevel_eval_in (m=0x7fffece38010, ex=0x7fffd8d8fcf0) at /buildworker/worker/package_linux64/build/src/builtins.c:496
#34 0x00007ffff187eaba in julia_eval_18160 () at boot.jl:235
#35 0x00007ffff187ead0 in jlcall_eval_18159 () from /opt/julia/lib/julia/sys.so
#36 0x00007ffff7713eea in jl_call_fptr_internal (fptr=<optimized out>, fptr=<optimized out>, nargs=<optimized out>, args=<optimized out>, meth=<optimized out>)
    at /buildworker/worker/package_linux64/build/src/julia_internal.h:339
#37 jl_call_method_internal (nargs=3, args=0x7fffffffd758, meth=<optimized out>) at /buildworker/worker/package_linux64/build/src/julia_internal.h:358
#38 jl_apply_generic (args=0x7fffffffd758, nargs=3) at /buildworker/worker/package_linux64/build/src/gf.c:1926
#39 0x00007ffff18ffd72 in julia_eval_user_input_20482 () at REPL.jl:66
#40 0x00007ffff18fff90 in jlcall_eval_user_input_20481 () from /opt/julia/lib/julia/sys.so
#41 0x00007ffff7713eea in jl_call_fptr_internal (fptr=<optimized out>, fptr=<optimized out>, nargs=<optimized out>, args=<optimized out>, meth=<optimized out>)
    at /buildworker/worker/package_linux64/build/src/julia_internal.h:339
#42 jl_call_method_internal (nargs=3, args=0x7fffffffd8f0, meth=<optimized out>) at /buildworker/worker/package_linux64/build/src/julia_internal.h:358
#43 jl_apply_generic (args=0x7fffffffd8f0, nargs=3) at /buildworker/worker/package_linux64/build/src/gf.c:1926
#44 0x00007fffdd02601f in macro expansion () at REPL.jl:97
#45 julia_#1_62779 () at event.jl:73
#46 0x00007fffdd0262b0 in jlcall_#1_62778 ()
#47 0x00007ffff7713eea in jl_call_fptr_internal (fptr=<optimized out>, fptr=<optimized out>, nargs=<optimized out>, args=<optimized out>, meth=<optimized out>)
    at /buildworker/worker/package_linux64/build/src/julia_internal.h:339
#48 jl_call_method_internal (nargs=1, args=0x7ffff08cc2f0, meth=<optimized out>) at /buildworker/worker/package_linux64/build/src/julia_internal.h:358
#49 jl_apply_generic (args=args@entry=0x7ffff08cc2f0, nargs=nargs@entry=1) at /buildworker/worker/package_linux64/build/src/gf.c:1926
#50 0x00007ffff772d1ab in jl_apply (nargs=1, args=0x7ffff08cc2f0) at /buildworker/worker/package_linux64/build/src/julia.h:1424
#51 start_task () at /buildworker/worker/package_linux64/build/src/task.c:267
#52 0x0000000000000000 in ?? ()

#11

So after many experiments, it seems the problem in my case depends on where julia is run from:

  • from /opt/julia - does not work,
  • from /home/julia - works
  • from /opt/julia on a clean virtual machine - works!
  • from /opt/julia on another physical linux machine - does not work!

Anyway, thanks to all of you for your time!


#12

That’s really strange. I tried to reproduce by moving my install to /opt, but no luck.

There might be some information in the errstring variable at frame 14 in the backtrace you posted (try frame 14 then p errstring). It looks like the static initializer is trying to run during cleanup from some other error.


#13
#14 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffc600, errstring=errstring@entry=0x7fffffffc608, mallocedp=mallocedp@entry=0x7fffffffc5ff, 
    operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffc610) at dl-error.c:187
187     dl-error.c: No such file or directory.
(gdb) p errstring
$1 = (const char **) 0x7fffffffc608

#14

Whoops, how about:

(gdb) p *errstring

(this will dereference the char** once, and then gdb should automatically print the string contents)


#15

I thought it was not very informative :)

#14 0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffc600, errstring=errstring@entry=0x7fffffffc608, mallocedp=mallocedp@entry=0x7fffffffc5ff, 
    operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffc610) at dl-error.c:187
187     in dl-error.c
(gdb) p *errstring
$6 = 0x7fffffffc748 ":8S\346\345\027\\"

#16

Looks like memory corruption so we can’t see anything there unfortunately. There’s a weird bug I hit earlier (you could check whether there are multiple libLLVM versions loaded, using pmap), but that should be excluded by your differential here.
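
The duplicate-libLLVM check can also be sketched from inside Julia rather than with pmap, assuming Libdl.dllist() reports every libLLVM copy mapped into the process and that the file name starts with "libLLVM" (as it does in the Linux backtrace above). More than one distinct path would point to the situation described.

```julia
# Collect the distinct libLLVM paths among the libraries loaded in this process.
llvm_paths = unique(filter(p -> startswith(basename(p), "libLLVM"), Base.Libdl.dllist()))

if length(llvm_paths) > 1
    println("multiple libLLVM copies loaded:")
    foreach(println, llvm_paths)
else
    println("libLLVM copies loaded: ", length(llvm_paths))
end
```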

Hopefully the workaround is sufficient. If this becomes a blocker, feel free to ping me here or in the JuliaCall issue and I’ll try to take another look.


#17

When I have more time, I will try to reproduce it on the virtual machine by installing the same software as on the two physical machines with the issue. In the meantime, I can of course run julia from a directory other than /opt/julia.

Thank you.