ccall causes an abort when the C library depends on both protobuf and tcmalloc


#1

I have a dynamic C library and I use ccall to call functions in it. ccall aborts with “attempt to free invalid pointer xxx” when the library depends on both protobuf and tcmalloc. Everything works when I dlopen the library from my own C program (no Julia involved), or when I remove either the tcmalloc or the protobuf dependency and ccall from Julia. Could you please suggest how I should debug this issue?

Thanks a lot in advance!

Software versions:
Julia 0.5.1
Protobuf 2.6.1

Below is the minimal example where this problem happens:


#2

What does it show in a debugger (build with -g)? How is tcmalloc linked and what symbols does it provide? What happens if you do exactly what you do in the C program (using Libdl.dlopen and Libdl.dlsym) instead?

It seems most likely that you are trying to link two malloc implementations into the same program. This won’t work well unless you make sure they don’t see each other (including each other’s allocated memory).
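One rough way to check whether two allocator providers really end up in the same process is to scan that process's memory map. The sketch below assumes Linux with /proc mounted; the helper names `maps_line_matches` and `count_mappings` are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the pathname column of one /proc/<pid>/maps line contains
 * `needle`, else 0. Lines for anonymous mappings have no pathname. */
int maps_line_matches(const char *line, const char *needle) {
    char path[512];
    /* Columns: address range, perms, offset, dev, inode, then pathname. */
    if (sscanf(line, "%*s %*s %*s %*s %*s %511[^\n]", path) != 1)
        return 0;
    return strstr(path, needle) != NULL;
}

/* Count mappings in /proc/<pid>/maps whose pathname contains `needle`.
 * Returns -1 if the maps file cannot be opened. */
int count_mappings(long pid, const char *needle) {
    char file[64], line[1024];
    int count = 0;
    FILE *f;

    assert(needle != NULL);
    snprintf(file, sizeof file, "/proc/%ld/maps", pid);
    f = fopen(file, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        count += maps_line_matches(line, needle);
    fclose(f);
    return count;
}
```

Calling `count_mappings(pid, "libtcmalloc")` with the pid of the running Julia process (or of the C test program) should show whether tcmalloc is mapped next to libc. It does not show which `malloc` each library actually binds to, so treat it as a first check only.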


#3

Thanks for your prompt response!

Here is what is shown when I load the core dump in gdb:

[New LWP 2454]
[New LWP 2458]
[New LWP 2455]
[New LWP 2456]
[New LWP 2457]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/home/ubuntu/julia-0.5.1/julia helloworld.jl'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f5e0a02a269 in raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
35      ../sysdeps/unix/sysv/linux/pt-raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f5e0ae5bc00 (LWP 2454))]
(gdb) backtrace
#0  0x00007f5e0a02a269 in raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#1  0x00007f5e0a6cb3c8 in sigdie_handler (sig=6, info=<optimized out>, context=0x7ffcf831e4c0) at /home/ubuntu/julia-0.5.1/src/signals-unix.c:84
#2  <signal handler called>
#3  0x00007f5e09c85428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#4  0x00007f5e09c8702a in __GI_abort () at abort.c:89
#5  0x00007f5bf36ec64e in tcmalloc::Log(tcmalloc::LogMode, char const*, int, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem, tcmalloc::LogItem) () from /usr/lib/libtcmalloc.so.4
#6  0x00007f5bf36e07df in ?? () from /usr/lib/libtcmalloc.so.4
#7  0x00007f5bf370370d in tc_deletearray () from /usr/lib/libtcmalloc.so.4
#8  0x00007f5bf36f4cac in MallocExtension::Initialize() () from /usr/lib/libtcmalloc.so.4
#9  0x00007f5bf36df825 in ?? () from /usr/lib/libtcmalloc.so.4
#10 0x00007f5e0ac5b4ea in call_init (l=<optimized out>, argc=argc@entry=2, argv=argv@entry=0x7ffcf8322a28, env=env@entry=0x2fc61b0) at dl-init.c:72
#11 0x00007f5e0ac5b5fb in call_init (env=0x2fc61b0, argv=0x7ffcf8322a28, argc=2, l=<optimized out>) at dl-init.c:30
#12 _dl_init (main_map=main_map@entry=0x2fbd3c0, argc=2, argv=0x7ffcf8322a28, env=0x2fc61b0) at dl-init.c:120
#13 0x00007f5e0ac60712 in dl_open_worker (a=a@entry=0x7ffcf831f050) at dl-open.c:575
#14 0x00007f5e0ac5b394 in _dl_catch_error (objname=objname@entry=0x7ffcf831f040, errstring=errstring@entry=0x7ffcf831f048, mallocedp=mallocedp@entry=0x7ffcf831f03f, operate=operate@entry=0x7f5e0ac60300 <dl_open_worker>, args=args@entry=0x7ffcf831f050) at dl-error.c:187
#15 0x00007f5e0ac5fbd9 in _dl_open (file=0x7f5c075ace50 "/home/ubuntu/test_protobuf/libtest.so", mode=-2147483639, caller_dlopen=0x7f5e0a698ae2 <jl_load_dynamic_library_+514>, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0x2fc61b0) at dl-open.c:660
#16 0x00007f5e0a43ef09 in dlopen_doit (a=a@entry=0x7ffcf831f280) at dlopen.c:66
#17 0x00007f5e0ac5b394 in _dl_catch_error (objname=0x14a1360, errstring=0x14a1368, mallocedp=0x14a1358, operate=0x7f5e0a43eeb0 <dlopen_doit>, args=0x7ffcf831f280) at dl-error.c:187
#18 0x00007f5e0a43f571 in _dlerror_run (operate=operate@entry=0x7f5e0a43eeb0 <dlopen_doit>, args=args@entry=0x7ffcf831f280) at dlerror.c:163
#19 0x00007f5e0a43efa1 in __dlopen (file=<optimized out>, mode=mode@entry=9) at dlopen.c:87
#20 0x00007f5e0a6988dc in jl_dlopen (filename=<optimized out>, flags=flags@entry=68) at /home/ubuntu/julia-0.5.1/src/dlload.c:83
#21 0x00007f5e0a698ae2 in jl_load_dynamic_library_ (modname=modname@entry=0x7f5c075ace50 "/home/ubuntu/test_protobuf/libtest.so", flags=flags@entry=68, throw_err=throw_err@entry=1) at /home/ubuntu/julia-0.5.1/src/dlload.c:150
#22 0x00007f5e0a698c9a in jl_load_dynamic_library (modname=modname@entry=0x7f5c075ace50 "/home/ubuntu/test_protobuf/libtest.so", flags=flags@entry=68) at /home/ubuntu/julia-0.5.1/src/dlload.c:223
#23 0x00007f5e0a6bed24 in jl_get_library (f_lib=f_lib@entry=0x7f5c075ace50 "/home/ubuntu/test_protobuf/libtest.so") at /home/ubuntu/julia-0.5.1/src/runtime_ccall.cpp:152
#24 0x00007f5e0a710c1c in emit_ccall (args=args@entry=0x7f5c077243d0, nargs=nargs@entry=3, ctx=ctx@entry=0x7ffcf83213a0) at /home/ubuntu/julia-0.5.1/src/ccall.cpp:1745
#25 0x00007f5e0a6f7101 in emit_intrinsic (f=JL_I::ccall, args=args@entry=0x7f5c077243d0, nargs=nargs@entry=3, ctx=ctx@entry=0x7ffcf83213a0) at /home/ubuntu/julia-0.5.1/src/intrinsics.cpp:932
#26 0x00007f5e0a6f94ef in emit_call (ex=ex@entry=0x7f5c077309f0, ctx=ctx@entry=0x7ffcf83213a0) at /home/ubuntu/julia-0.5.1/src/codegen.cpp:2719
#27 0x00007f5e0a6fa1d1 in emit_expr (expr=0x7f5c077309f0, ctx=ctx@entry=0x7ffcf83213a0) at /home/ubuntu/julia-0.5.1/src/codegen.cpp:3172
#28 0x00007f5e0a709105 in emit_function (lam=lam@entry=0x7f5c0759f730, declarations=declarations@entry=0x7f5c0759f7a8) at /home/ubuntu/julia-0.5.1/src/codegen.cpp:4693
#29 0x00007f5e0a70a5c7 in jl_compile_linfo (li=li@entry=0x7f5c0759f730) at /home/ubuntu/julia-0.5.1/src/codegen.cpp:809
#30 0x00007f5e0a67ee58 in jl_compile_for_dispatch (li=li@entry=0x7f5c0759f730) at /home/ubuntu/julia-0.5.1/src/gf.c:1313
#31 0x00007f5e0a6af411 in jl_call_method_internal (nargs=1, args=0x7ffcf8321a98, meth=0x7f5c0759f730) at /home/ubuntu/julia-0.5.1/src/julia_internal.h:205
#32 jl_toplevel_eval_flex (e=<optimized out>, fast=fast@entry=1, expanded=expanded@entry=1) at /home/ubuntu/julia-0.5.1/src/toplevel.c:569
#33 0x00007f5e0a68802e in jl_parse_eval_all (fname=fname@entry=0x7f5c077289d0 "/home/ubuntu/test_protobuf/helloworld.jl", content=content@entry=0x0, contentlen=contentlen@entry=0) at /home/ubuntu/julia-0.5.1/src/ast.c:717
#34 0x00007f5e0a6af543 in jl_load (fname=0x7f5c077289d0 "/home/ubuntu/test_protobuf/helloworld.jl") at /home/ubuntu/julia-0.5.1/src/toplevel.c:596
#35 0x00007f5e0a6af5f8 in jl_load_ (str=<optimized out>) at /home/ubuntu/julia-0.5.1/src/toplevel.c:605
#36 0x00007f5e054ceead in julia_include_from_node1_20311 (_path=...) at loading.jl:488
#37 0x00007f5e054cf0ac in jlcall_include_from_node1_20311 () from /home/ubuntu/julia-0.5.1/usr/lib/julia/sys.so
#38 0x00007f5e0a67e760 in jl_call_method_internal (nargs=2, args=0x7ffcf8322270, meth=0x7f5c06214fd0) at /home/ubuntu/julia-0.5.1/src/julia_internal.h:210
#39 jl_apply_generic (args=0x7ffcf8322270, nargs=<optimized out>) at /home/ubuntu/julia-0.5.1/src/gf.c:1950
#40 0x00007f5e054f2121 in julia_process_options_21671 (opts=...) at client.jl:265
#41 0x00007f5e054f3c16 in julia__start_21662 () at client.jl:321
#42 0x00007f5e054f4629 in jlcall.start_21662 () from /home/ubuntu/julia-0.5.1/usr/lib/julia/sys.so
#43 0x00007f5e0a67e760 in jl_call_method_internal (nargs=1, args=0x7ffcf83227e0, meth=0x7f5c0661c1c0) at /home/ubuntu/julia-0.5.1/src/julia_internal.h:210
#44 jl_apply_generic (args=args@entry=0x7ffcf83227e0, nargs=nargs@entry=1) at /home/ubuntu/julia-0.5.1/src/gf.c:1950
#45 0x0000000000401add in jl_apply (nargs=1, args=0x7ffcf83227e0) at /home/ubuntu/julia-0.5.1/ui/../src/julia.h:1392
#46 true_main (argc=1, argv=0x7ffcf8322a30) at /home/ubuntu/julia-0.5.1/ui/repl.c:123
#47 0x0000000000401397 in main (argc=1, argv=0x7ffcf8322a30) at /home/ubuntu/julia-0.5.1/ui/repl.c:243

tcmalloc is dynamically linked. It provides many symbols (I’m not sure which ones you are looking for), including “malloc”. Below are all the dependencies of my C library:

    linux-vdso.so.1 =>  (0x00007ffe631fd000)
    libprotobuf.so.9 => /usr/lib/x86_64-linux-gnu/libprotobuf.so.9 (0x00007fc201de5000)
    libtcmalloc.so.4 => /usr/lib/libtcmalloc.so.4 (0x00007fc201b74000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc20195d000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc201594000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc201377000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc20115c000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc200dda000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc200ad1000)
    libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007fc2008b5000)
    /lib64/ld-linux-x86-64.so.2 (0x0000563d23d8f000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fc200693000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc20048e000)

Calling Libdl.dlopen directly, as in lib = Libdl.dlopen("/home/ubuntu/test/libtest.so"), crashed with the same error.


#4

I guess you mean that when I dynamically load tcmalloc, it conflicts with the malloc that Julia uses? But the program worked when I kept tcmalloc and removed the dependency on protobuf.


#5

You are not using the same flags in Julia as you are in C.

It conflicts with libc.

That can easily happen, depending on how things are linked.


#6

Sorry, I forgot to mention: I think the default flags for Julia’s Libdl.dlopen are RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND, so I tried those in my C program too, and it worked (this way I don’t have to figure out the integer values of those macros to use them in Julia).
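For reference, the libc-level integer values of those macros can be printed from C with a sketch like the one below. It assumes glibc on Linux (RTLD_DEEPBIND is a glibc extension), and the function names are made up for illustration. On glibc the three macros are 1, 0 and 8, which matches the mode=9 that __dlopen receives in frame #19 of the backtrace. Frame #20 shows Julia's jl_dlopen receiving flags=68 for the same call, which suggests Julia numbers these flags differently. If so, copying the libc integers into Julia would be wrong, and the named constants in Libdl (assumed here to include Libdl.RTLD_LAZY and Libdl.RTLD_DEEPBIND) are the safer route.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* request GNU extensions; RTLD_DEEPBIND is glibc-specific */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* The flag combination named in the post above. */
int deepbind_flags(void) {
    int flags = RTLD_LAZY | RTLD_LOCAL | RTLD_DEEPBIND;
    /* RTLD_LOCAL is the default scope, so the global bit must be clear. */
    assert((flags & RTLD_GLOBAL) == 0);
    return flags;
}

/* Print the libc-level integer value of each macro and of the combination. */
void print_dlopen_flags(void) {
    printf("RTLD_LAZY     = %d\n", RTLD_LAZY);
    printf("RTLD_LOCAL    = %d\n", RTLD_LOCAL);
    printf("RTLD_DEEPBIND = %d\n", RTLD_DEEPBIND);
    printf("combined      = %d\n", deepbind_flags());
}
```

Calling `print_dlopen_flags()` from a small `main` shows the values on the machine at hand. The result of `deepbind_flags()` is what a C test program would pass as the second argument of dlopen to mimic Julia's call.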

Could you please elaborate a little on how I can “make sure they don’t see each other”? I don’t have much of a clue right now. For example, if I set LD_PRELOAD to tcmalloc when running Julia, would that make Julia use tcmalloc for its malloc too?

Also, Libdl.dlsym returns a function pointer; how can I call that function from Julia?


#7

It seems LD_PRELOAD does solve the problem.