Hi, I’d like to hear people’s thoughts about the use of ccall((:function, "library"), ...) as opposed to ccall(:function, ...) (see Calling C and Fortran Code · The Julia Language) and whether the former variant should be discouraged or the underlying code changed.
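For concreteness, here is a minimal sketch of the two variants side by side, calling strlen from the C library. The library name "libc.so.6" is my assumption for Linux with glibc; other platforms name it differently.

```julia
# Symbol-only form: the name is resolved through the process-wide symbol search.
n_implicit = ccall(:strlen, Csize_t, (Cstring,), "hello")

# Explicit-library form: Julia loads the named library for you.
# "libc.so.6" is an assumption (Linux with glibc); adjust for other platforms.
n_explicit = ccall((:strlen, "libc.so.6"), Csize_t, (Cstring,), "hello")

println(Int(n_implicit), " ", Int(n_explicit))
```

Both calls should return the same length here; the only difference is how the symbol gets looked up.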
The problem is that specifying the shared-object library is very un-C-like and prevents one from using profilers and other tools that rely on LD_PRELOAD to inject code. When writing C or C++, the library that provides a function is not named in the source code. Shared objects are dynamically linked at runtime, and the function for a particular call is found by searching the symbols exported by all loaded shared objects. This is equivalent to the Julia ccall(:function, ...) variant. If the user has specified a shared object with the LD_PRELOAD environment variable at runtime, that library’s functions go to the top of the search list, potentially overriding functions of the same name in other shared objects. This is a common technique in profiling tools that, say, wrap code around malloc to track memory allocations.
Use of LD_PRELOAD works with Julia so long as you do not specify the library in ccall. See this gist for a simple example.
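As a rough sketch of what that looks like from the Julia side (this is not the gist itself), one can launch a child Julia process with LD_PRELOAD set. The path /path/to/libwrap.so is hypothetical and stands in for whatever interposition library a profiler provides; the child script uses only symbol-only ccall.

```julia
# Hypothetical path to an interposition library (e.g. one that wraps malloc).
preload_lib = "/path/to/libwrap.so"

# Child script: symbol-only ccalls, so no library is named in the call.
script = """
p = ccall(:malloc, Ptr{Cvoid}, (Csize_t,), 64)
ccall(:free, Cvoid, (Ptr{Cvoid},), p)
"""

# Build a command for a child Julia process with LD_PRELOAD added to the
# inherited environment.
cmd = addenv(`$(Base.julia_cmd()) -e $script`, "LD_PRELOAD" => preload_lib)

# run(cmd)  # uncomment once preload_lib points at a real shared object
```

Setting the variable in the shell before starting julia has the same effect; the addenv form just keeps the example in one language.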
However, the Julia documentation encourages specifying the library with ccall((:function, "library"), ...), and so tools that rely on LD_PRELOAD do not work with code written that way. I find that many Julia packages that call C/C++ functions use this variant of ccall. My specific problem is that I want to use a tool called Darshan to profile HDF5 I/O from my Julia application. Darshan uses LD_PRELOAD to wrap MPI-IO and HDF5 C functions to add counters and timers. To make this work, I have my own versions of MPI.jl and HDF5.jl where I’ve changed the ccall statements to use the non-explicit-library variant (I’ve submitted PRs to both packages … for MPI.jl all ccall statements were already prefixed by a macro and the authors have a PR to change that macro to remove the library argument from the ccall).
I do understand the rationale behind ccall((:function, "library"), ...), as it loads the shared object for you. I’m wondering if that can be changed so that the library is loaded with the correct options to make LD_PRELOAD work (e.g. on Linux one needs to do Libdl.dlopen(shared_object_file, RTLD_GLOBAL)). Alternatively, the documentation could be changed to encourage the non-explicit-library variant. Or, at the very least, it could carry a warning that the explicit-library variant has some consequences.
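To illustrate the dlopen idea, here is a minimal sketch that loads a library into the global symbol namespace and then calls into it with the symbol-only ccall. The library name "libm.so.6" and the choice of cbrt are my assumptions (Linux with glibc), picked only to keep the example self-contained; a package would substitute its own shared object.

```julia
using Libdl

# Assumption: Linux with glibc, where the math library is "libm.so.6".
lib = "libm.so.6"

# Load the library with RTLD_GLOBAL so its symbols join the process-wide
# search. Passing flags explicitly replaces Libdl's default flags.
handle = Libdl.dlopen(lib, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)

# Symbol-only ccall: no library is named, so the global search order applies.
y = ccall(:cbrt, Cdouble, (Cdouble,), 27.0)
println(y)
```

Whether a given profiler’s wrappers then take effect will still depend on the platform’s dynamic linker, so treat this as a starting point rather than a guaranteed fix.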
I thought I’d bring this up here before making an issue or a PR to see if anyone else has thought about this concern.
Thanks! – Adam