Discussion of ccall( (:function, "library"), ...)

Hi, I’d like to hear people’s thoughts about the use of ccall( (:function, "library"), ...) as opposed to ccall( :function, ...) (see Calling C and Fortran Code · The Julia Language) and whether the former variant should be discouraged or the underlying code changed.

The problem is that specifying the shared-object library is very un-C-like and prevents one from using profilers and other tools that rely on LD_PRELOAD to inject code. When writing C or C++, the library that provides a function is not specified in the source code. Shared objects are dynamically linked into the program at runtime, and the function for a particular call is found by searching through the functions exported by all such shared objects. This method is equivalent to the Julia ccall( :function, ...) variant. If the user has specified a shared object with the LD_PRELOAD environment variable at runtime, that library’s functions go to the top of the function search list, potentially overriding functions of the same name in other shared objects. This is a common technique in profiling tools that, say, wrap code around malloc to track memory allocations.
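To make the two variants concrete, here is a minimal sketch using strlen from the C library. The helper names strlen_global and strlen_libc are made up for illustration, and the library name "libc.so.6" assumes a glibc-based Linux system.

```julia
# Variant 1: no library given. The symbol is resolved through the
# process-wide lookup, which is where an LD_PRELOAD'ed library can
# take precedence.
strlen_global(s::String) = Int(ccall(:strlen, Csize_t, (Cstring,), s))

# Variant 2: library given explicitly. Julia loads that library and
# looks the symbol up there.
# Assumption: glibc-based Linux, where the C library is "libc.so.6".
strlen_libc(s::String) = Int(ccall((:strlen, "libc.so.6"), Csize_t, (Cstring,), s))

println(strlen_global("hello"))
```

Both functions compute the same thing; the only difference is how the symbol is located, which is exactly what matters for LD_PRELOAD-based tools.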

Use of LD_PRELOAD works with Julia so long as you do not specify the library in ccall. See this gist for a simple example.

However, the Julia documentation encourages specifying the library with ccall( (:function, "library"), ...) and so tools that rely on LD_PRELOAD do not work. I find that many Julia packages that call C/C++ functions use this variant of ccall. My specific problem is that I want to use a tool called Darshan to profile HDF5 I/O from my Julia application. Darshan uses LD_PRELOAD to wrap MPI-IO and HDF5 C functions to add counters and timers. To make this work, I have my own versions of MPI.jl and HDF5.jl where I’ve changed the ccall statements to use the non-explicit-library variant (I’ve submitted PRs to both packages … for MPI.jl all ccall statements were already prefixed by a macro and the authors have a PR to change that macro to remove the library argument from the ccall).

I do understand the rationale behind ccall( (:function, "library"), ...) as it loads the shared object for you. I’m wondering if that can be changed so that the library is loaded with the correct options to make LD_PRELOAD work (e.g. on Linux one needs to do Libdl.dlopen(shared_object_file, RTLD_GLOBAL)). Or the documentation could be changed to encourage the non-explicit-library variant. Or, at least, it could carry a warning that the explicit-library variant has some consequences.
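For reference, this is roughly what such an explicit load looks like with the Libdl standard library. The helper name load_globally is hypothetical and "libm.so.6" assumes a glibc-based Linux system. The sketch only shows the call; whether this flag alone is enough for LD_PRELOAD interposition is the open question here.

```julia
using Libdl

# Hypothetical helper: open a library with RTLD_GLOBAL so that its
# symbols become available for resolving symbols of libraries loaded
# afterwards (see dlopen(3)).
function load_globally(path::String)
    return dlopen(path, RTLD_LAZY | RTLD_GLOBAL)
end

if Sys.islinux()
    # Assumption: glibc-based Linux, where the math library is "libm.so.6".
    handle = load_globally("libm.so.6")
    # With the library loaded, the symbol can be called without naming it.
    println(ccall(:cos, Cdouble, (Cdouble,), 0.0))
end
```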

I thought I’d bring this up here before making an issue or a PR to see if anyone else has thought about this concern.

Thanks! – Adam


I think the point of explicitly calling into the library is to avoid situations like

shell> cat foo.c
int test() {
    return 1;
}

shell> gcc -shared -fPIC -o foo.so foo.c

shell> cat bar.c
int test() {
    return 2;
}

shell> gcc -shared -fPIC -o bar.so bar.c

julia> using Libdl

julia> dlopen("./foo.so", RTLD_LAZY | RTLD_DEEPBIND | RTLD_GLOBAL)
Ptr{Nothing} @0x000056086a4c5680

julia> dlopen("./bar.so", RTLD_LAZY | RTLD_DEEPBIND | RTLD_GLOBAL)
Ptr{Nothing} @0x000056086acf1530

julia> ccall(:test, Cint, ()) # can you guess which function will be called?

Libdl.dllist() will tell you the shared object search order.

C/C++ has the same behavior. I’m certainly not saying this behavior is a good idea or defending it, but it’s what C/C++ does and some tools take advantage of it.
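A minimal sketch of inspecting that list from within Julia, using only Libdl.dllist() from the standard library; the output depends entirely on the system and the packages loaded.

```julia
using Libdl

# dllist() returns the paths of the shared libraries currently loaded
# into the Julia process.
for (i, path) in enumerate(dllist())
    println(i, ": ", path)
end
```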


I mean, also a malicious library could do that.

Well, yes, but C/C++ has the same vulnerability. I think the thing that saves us is that it’s hard to trick people into setting the LD_PRELOAD environment variable that would allow injection of malicious code.

I will point out, perhaps pedantically, that this is not really a C/C++ feature, but more a feature of the underlying libc. This doesn’t really matter much to the discussion, but I wanted to point out that this feature was likely not in the minds of the C developers when they were initially designing the language; it’s tooling that has grown up around the dynamic linker on various platforms.

> I do understand the rationale behind ccall( (:function, "library"), ...) as it loads the shared object for you.

I would argue that is not, in fact, the primary reason for ccall((:function, library)) (here I’m not making a distinction between library::String and library::Ptr{Cvoid}, such as the result from dlopen()). I would argue instead that the primary reason is so that ccall(), by default, ignores namespace pollution.

For the vast majority of C/C++ programs, which libraries you are going to load and what symbols they contain is a very static table. At compile time you list -lfoo -lbar and these get embedded into your executable’s import list, and this is not expected to change over time. From the compiler’s perspective, if two symbols with the same name are embedded within the same namespace (e.g. the same shared library/executable), it will error, because that would create an unresolvable ambiguity. The dynamic linker doesn’t error and has platform-specific behavior: Linux, FreeBSD and macOS all allow you to set it into a mode where the first symbol entered into the table wins, while Windows doesn’t maintain a global symbol table at all; you must scope your dlsym() queries to a specific namespace.

In practice, this often works quite well because the global namespace in C/C++ applications is a hand-crafted thing that developers more or less have control over. When you build an application, you know what is going to be going on within your namespace, and in the rare cases where you have, e.g., libfoo being loaded within giant_application, and libfoo may conflict with libbar, some OSes provide facilities such as multi-level namespaces that can solve this problem with specific flags given to dlopen().

Julia, on the other hand, generally has a much more hostile namespace environment. Users interactively run using Foo all over the place, the order in which libraries are loaded is not guaranteed, and in general the community acts like a giant fuzzer for library loading combinations. :) It is quite common for us to run into symbol name clashes.

See, for instance, this issue, where if you use PyCall to load NumPy on a Julia that uses MKL as its backing BLAS library, you get a segfault. This happens because MKL puts its symbols into the global symbol table and provides ILP64 versions of those symbols, while NumPy expects non-ILP64 versions because it is trying to call the functions in the BLAS library it shipped with. If NumPy were to use the Python equivalent of the ccall((:function, library), ...) syntax, it wouldn’t have these problems. (I will point out, for completeness’ sake, that we have asked Intel to change their naming policy so that, as we do with OpenBLAS, ILP64 symbols get a different name such as dgemm64_ instead of just dgemm, but it’s an uphill battle.)

In our experience, users wanting to override symbol lookup through LD_PRELOAD are much, much rarer than two completely unrelated libraries containing conflicting symbol names. Additionally, since users are building larger and larger applications all the time, the namespacing issue will only become easier to trigger.

Now, all that being said, the best of both worlds would be to allow explicit overloading of symbol resolution. It seems to me that one way this can be done is by using the facility in Julia 1.6 to parameterize the library argument to ccall. E.g. you could do the following:

$ cat foo.c
int always_foo() {
    return 1;
}

int sometimes_foo() {
    return 2;
}
$ cat bar.c
int sometimes_foo() {
    return 5040;
}
$ gcc -o foo.so -shared -fPIC foo.c
$ gcc -o bar.so -shared -fPIC bar.c
julia> function get_overridden_library(func_name::Symbol)
           if func_name in (:sometimes_foo,)
               return "./bar.so"
           end
           return "./foo.so"
       end

julia> ccall((:always_foo, get_overridden_library(:always_foo)), Cint, ())
1
julia> ccall((:sometimes_foo, get_overridden_library(:sometimes_foo)), Cint, ())
5040

Note that this only works on Julia v1.6+; in Julia v1.5 and earlier you can’t dynamically compute the library name. Also, for each ccall() site, your choice gets baked into the generated code, so you can’t dynamically switch at runtime; the result of the function that you called to determine the library path gets cached.


Thanks for your detailed and interesting post! The new features for dynamically computing the library name in Julia 1.6+ look interesting.

I’ll just point out two things… You wrote that C/C++ applications hand-craft the global namespace. Counter-examples are many of the C++ frameworks used in code to process and analyze High Energy Physics experiment data. Such frameworks, like art, rely on dynamic loading at runtime of many modules to produce, select, and analyze data. The framework code itself handles the dynamic loading, the I/O loop, and other infrastructure stuff while researchers write the modules with the science code. So in this sense, the application itself is not defined until runtime. Of course, name collisions are a problem and most frameworks will error if loading a shared object creates a name conflict.

On the other hand, LD_PRELOAD works “above” the application, where naming collisions are intentional for tools that do profiling. We use such tools often to help us make our code performant. The nice thing about LD_PRELOAD is that you can use such tools without changing or recompiling the code to be profiled. Would be nice if that worked for Julia code too (e.g. for what I’m trying to do by profiling HDF5 C calls).

> Counter-examples are …

Good point, there are always exceptions to the rule.

> Of course, name collisions are a problem and most frameworks will error if loading a shared object creates a name conflict.

A valid approach, but not the one we would ultimately choose if we can avoid it.

> Would be nice if that worked for Julia code too

Yes… having thought about this a bit more and talked it over with @vchuravy on the Julia HPC community call today, I think it might not be too difficult to emulate LD_PRELOAD in Julia. Here’s how I would do it:

  • First, we could make this either a global or an opt-in semantic. If it’s global, it will need to be merged into Julia base, and you’ll need to convince more than just me that this is a good idea ;). If it’s opt-in, it can be done in a package.

  • The basic idea is to reimplement the necessary logic of LD_PRELOAD, which, when you boil it down, becomes essentially:

function allow_ld_preload(func_name::Symbol, lib_path::String)
    preload_libs = split(get(ENV, "LD_PRELOAD", ""), ":")
    for prelib in preload_libs
        preload_lookup = dlsym(prelib, func_name; throw_error=false)
        if preload_lookup !== nothing
            return (func_name, prelib)
        end
    end
    return (func_name, lib_path)
end

Ignore for the moment the fact that dlsym() needs a handle and not a String, and that doing dlsym() again and again for each call site may not be the most efficient way to do this; in the end, this is quite a simple set of logic. For the opt-in use case, this can be used to generate that libname argument on the fly:

ccall(allow_ld_preload(:foo, "libfoo"), ...)

For this to be the default, we’d have to do that equivalent logic within ccall() itself. That’s not impossible, but there may be difficulties I haven’t thought of yet. Also, the benefit of doing this in Julia itself is that this approach would work on all platforms (including Windows), so we could become the first programming language to support LD_PRELOAD on Windows! ;)
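As a rough sketch of how the handle issue in allow_ld_preload could be addressed, the variant below opens each preload entry with dlopen() before querying it. It still re-checks LD_PRELOAD on every call, and it is only an illustration under stated assumptions, not an existing API: the function is hypothetical, and splitting on both colons and spaces follows the separators documented for ld.so on Linux.

```julia
using Libdl

# Hypothetical helper, not part of Julia or any package.
# Returns (func_name, library) where library is the first LD_PRELOAD
# entry that exports func_name, or lib_path if none does.
function allow_ld_preload(func_name::Symbol, lib_path::String)
    # ld.so accepts both colons and spaces as separators in LD_PRELOAD.
    entries = split(get(ENV, "LD_PRELOAD", ""), [':', ' ']; keepempty=false)
    for prelib in entries
        # dlsym() needs a handle, so open the library first; entries that
        # cannot be opened are skipped instead of raising an error.
        handle = dlopen(prelib; throw_error=false)
        handle === nothing && continue
        if dlsym(handle, func_name; throw_error=false) !== nothing
            return (func_name, String(prelib))
        end
    end
    return (func_name, lib_path)
end

println(allow_ld_preload(:foo, "libfoo"))
```

With LD_PRELOAD unset, the call simply returns the original pair, so packages that opt in would behave as before unless the user sets the variable.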


Thanks! Sorry I missed the Julia HPC call (I always have a conflicting meeting over that time). Would be nice to somehow cache the library result (sounds like in Julia 1.6 this will be possible).

In general, I think this is a good approach. Though the “spirit” of LD_PRELOAD is that you don’t have to change your code to bring in the profiler or whatever.

A suggestion in my HDF5 PR from @musm was to use Cassette.jl to do the injection (e.g. remove the library specification from ccall statements - or change the library name to the one in LD_PRELOAD - I guess like the macro in MPI.jl but could go deeper). I like that idea, because package code wouldn’t need to change. Though I wonder what the performance implications would be. I also don’t know enough about Cassette.jl to try something like that. But from what I’ve seen I think it is possible. Would be interested to hear thoughts on that approach.