Opening two interdependent .so files in Libdl

I’m trying to wrap a C++ library using CxxWrap.jl. I’ve built the libs with BinaryBuilder (https://github.com/jstrube/FastJetBuilder/releases), and the wrapper code compiles (https://github.com/jstrube/FastJetWrapBuilder/releases), but when I try to load the libraries into Julia, it fails.
In particular:

julia> Libdl.dlopen("libfastjettools.so.0.0.0")
ERROR: could not load library "libfastjettools.so.0.0.0"
libfastjettools.so.0.0.0: undefined symbol: _ZTIN7fastjet25ClusterSequenceActiveAreaE
Stacktrace:
 [1] dlopen(::String, ::UInt32) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Libdl/src/Libdl.jl:97 (repeats 2 times)
 [2] top-level scope at none:0

I have found that symbol in another lib:

julia> x = Libdl.dlopen("libfastjet.so.0.0.0")
Ptr{Nothing} @0x0000000002112180

julia> Libdl.dlsym(x, :_ZTIN7fastjet25ClusterSequenceActiveAreaE)
Ptr{Nothing} @0x00007fb3814f61a8

I thought that loading that library first, and then the other one, should fix the problem, but it doesn’t. Regardless of whether or not I load libfastjet, loading libfastjettools fails. Compiling the C++ example (http://www.fastjet.fr/quickstart.html) with the same libraries works fine…
What do I need to do to tell Julia where to find the missing symbol?

I am not sure I can help, but I will try. I have some questions.

Have you modified Libdl.DL_LOAD_PATH?

Wouldn’t it help to modify ENV["LD_LIBRARY_PATH"] to include the path to libfastjet.so.0.0.0 and then call Libdl.dlopen("libfastjettools.so.0.0.0")?

(and maybe modifying LD_LIBRARY_PATH before calling julia could help)

Where are these libraries?

Without checking the actual compilation command or the compiled binary, the first issue seems to be that libfastjettools.so isn’t linked to its dependency directly. Once that’s fixed, there should be no problem.
At the very least, if it were linked but the dependency couldn’t be found, the error would be different. (In other words, you do not have a path issue, at least not at this point.)

The reason calling dlopen twice doesn’t work as a workaround is that on Linux the default flag is RTLD_LOCAL. You need RTLD_GLOBAL to make the symbols from one lib accessible to another. See the help for dlopen.
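
A minimal sketch of what that could look like, assuming both libraries are available at the paths you pass in. The helper name `dlopen_with_dependency` is made up for illustration; the flags are Julia's defaults (RTLD_LAZY|RTLD_DEEPBIND) plus RTLD_GLOBAL.

```julia
using Libdl

# Hypothetical helper (not part of Libdl): open `dep` so that its symbols
# become globally visible, then open `lib`, which uses those symbols
# without declaring `dep` as a dependency.
function dlopen_with_dependency(dep::AbstractString, lib::AbstractString)
    # RTLD_GLOBAL makes the symbols of `dep` available to libraries loaded later.
    dep_handle = Libdl.dlopen(dep, Libdl.RTLD_LAZY | Libdl.RTLD_DEEPBIND | Libdl.RTLD_GLOBAL)
    # The second library can now resolve its undefined symbols against `dep`.
    lib_handle = Libdl.dlopen(lib)
    return dep_handle, lib_handle
end

# Example call (paths are assumptions):
# dlopen_with_dependency("./libfastjet.so", "./libfastjettools.so")
```

In a package, a call like this would normally go into the module's `__init__` function, since library handles have to be created at load time rather than at precompile time.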


Thanks for your input. That is correct, libfastjettools is not linked to libfastjet. Maybe I can file a bug report about that with the fastjet folks… I’ll try your suggestion with the flags later today. Would still have to figure out how to forward that to BinaryProvider, though. Will try fixing upstream first.

I had to work through this as well, because RingBuffers.jl ships a C library whose functions other packages, e.g. PortAudio.jl, call from their own C code. You can see here where the ringbuffer library is opened with the RTLD_GLOBAL flag.

Thanks for chiming in. You can find the libs for your architecture under the first link I sent.
In this case, the path is not the problem. The lib is found, but it cannot be opened due to the missing symbols.


Thanks! I’ll try those flags, but I’m looking for a solution that’ll work with BinaryProvider. I don’t see a way to forward those flags to the LibraryProduct, currently, but maybe I’m missing something.
So it looks like I’ll need to file a bug with either the C++ code package or BinaryProvider. Will try in that order.

If you look at the build.jl, you can see I’m still using BinaryProvider for all the platform detection and binary selection; I’m just not using the built-in library opening. It’s possible that this could be a useful new feature for BinaryProvider, but currently this approach still takes advantage of BinaryProvider for all the hard stuff.


Your SO files have (incorrect) absolute paths set for their link search lists (RPATH):

readelf -d libfastjettools.so

Dynamic section at offset 0x45b40 contains 27 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x000000000000000e (SONAME)             Library soname: [libfastjettools.so.0]
 0x000000000000000f (RPATH)              Library rpath: [/opt/x86_64-linux-gnu/x86_64-linux-gnu/lib/../lib64]
...

BinaryBuilder is supposed to fix this sort of thing by magically running something like

patchelf --set-rpath '$ORIGIN' libfastjettools.so

which would tell your SOs where to find each other. Let’s ask @staticfloat what to do.
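
For anyone who would rather check this from Julia than by eye, here is a rough sketch that pulls the NEEDED, SONAME and RPATH/RUNPATH entries out of `readelf -d` output. The function name `parse_dynamic_section` is made up for illustration, and the parsing assumes the GNU readelf layout shown above.

```julia
# Hypothetical helper: extract selected entries from the text printed by `readelf -d`.
function parse_dynamic_section(text::AbstractString)
    needed = String[]
    soname = nothing
    rpath = String[]
    for line in split(text, '\n')
        # Match lines such as "... (NEEDED)   Shared library: [libm.so.6]".
        m = match(r"\((NEEDED|SONAME|RPATH|RUNPATH)\)[^\[]*\[([^\]]*)\]", line)
        m === nothing && continue
        tag, value = m.captures
        if tag == "NEEDED"
            push!(needed, String(value))
        elseif tag == "SONAME"
            soname = String(value)
        else
            # RPATH and RUNPATH hold a colon-separated list of directories.
            append!(rpath, String.(split(value, ':')))
        end
    end
    return (needed = needed, soname = soname, rpath = rpath)
end

# Example call (assumes readelf is on the PATH and the file exists):
# info = parse_dynamic_section(read(`readelf -d libfastjettools.so`, String))
# "libfastjet.so.0" in info.needed
```

For the excerpt above, `needed` lists libstdc++, libm, libc and libgcc_s but no libfastjet entry, which is consistent with the undefined-symbol error from the first post.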

That would be great! If you have a suggestion for how to fix the automake script upstream, that would also be useful. Although I’m extremely curious how this would be fixed for other architectures… This is just a bit outside of my area of expertise.

Yichao’s workaround does indeed work (note that the default linker flags are different on different platforms, so e.g. this is not needed on OSX):

julia> using Libdl

julia> h = dlopen("./libfastjet.so", RTLD_LAZY|RTLD_DEEPBIND|RTLD_GLOBAL)
Ptr{Nothing} @0x000000000173ca30

julia> h2 = dlopen("./libfastjettools.so")
Ptr{Nothing} @0x0000000001755840

This is because the linking step for libfastjettools doesn’t declare a dependency on libfastjet. I believe this is by design: the source code doesn’t look like an oversight to me. The designers apparently want it to work like this, which is kind of weird, but alright.

Valid solutions to this are to manually dlopen as suggested by Yichao and Spencer. Also valid is to glom all the libraries together into a single .so, since that seems like the simplest way to get this usable by Julia.

Thanks. If that’s the recommended solution, I can work with that.
On a somewhat related note: I’m using CxxWrap for the bindings, which needs the julia lib. For some reason, the wrapper library (second link in my first post) gets linked to …/lib/libjulia instead of libjulia without the relative paths.
Is there something I can do about this resolution? I’ve just copied my setup from LCIOWrapBuilder, which has worked fine for me, so I don’t understand where the difference comes from. cc @barche

I don’t see the problem either. I’d suggest adding the --verbose parameter to build-tarballs.jl in the travis file here and adding VERBOSE=ON before the build command in the build script, to see the full commands.

Thank you for the suggestion. In fact, there’s actually an error, but the build still claims success and makes a tarball.
From: https://travis-ci.com/jstrube/FastJetWrapBuilder/jobs/146539049#L765

ERROR: could not load library "/home/travis/build/build/x86_64-linux-gnu/vD3t0oyQ/destdir/lib/libfastjetwrap.so"
../julia-1.0.0/lib/libjulia.so.1: cannot open shared object file: No such file or directory

Looking further, https://travis-ci.com/jstrube/FastJetWrapBuilder/jobs/146539049#L757-L758 suggests that somehow julia was installed in the wrong place. I’m pretty sure I’m using the same command to extract julia as in the package that succeeds, but somehow this is not installing into /usr/local/lib. Will investigate further…

But https://travis-ci.com/jstrube/FastJetWrapBuilder/jobs/146539049#L733 says

/opt/x86_64-linux-gnu/bin/x86_64-linux-gnu-g++ --sysroot=/opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/ -fPIC   -shared -Wl,-soname,libfastjetwrap.so -o libfastjetwrap.so CMakeFiles/fastjetwrap.dir/src/fastJetWrap.cc.o -Wl,-rpath,/workspace/destdir/lib:/usr/local/lib: /workspace/destdir/lib/libcxxwrap_julia.so.0.4.0 /usr/local/lib/libjulia.so

So, that’s definitely linking the right lib. Where does ../julia-1.0.0/lib come into play?

I think it comes from the binary dependency on Julia here. This is not used on LCIO. I’m not sure if that breaks because you are still downloading Julia manually also. It’s nice to see the dependency exists now, I should update libcxxwrap-julia to use it, then maybe you don’t need to add it explicitly because it will be pulled as a dependency for libcxxwrap-julia? CC @staticfloat

Ahh…Interesting… I hadn’t thought of that. I could try removing that, then. But it’s linking in the libjulia from /usr/local/lib correctly in cmake, and all the other steps also find libjulia in the right location (e.g. this). Then the auto-mapping comes in and changes the path to the wrong location. Filed a bug report here https://github.com/JuliaPackaging/BinaryBuilder.jl/issues/354, since this is kind of a different topic from the original one.