I’ve been messing around with registering callbacks in a C library from Julia and ran into some things I cannot explain. The main thing I do not understand is that my code works fine from a Julia script that I either run in the REPL line by line or by calling `julia myfile.jl`, but when I put the code inside of a module I get segfaults when the callback is invoked.
I was also wondering what can actually be called inside of the callback. I’ve seen some threads and old docs alluding to the fact that you cannot do I/O or interact with the Julia runtime from your callback, but I am perfectly able to call `GC.gc()` and also `println` in my working test. Are there actually limitations here?
The “manual” version basically `dlopen`s the C library, registers the callback and then triggers the callback. The version wrapped in a module registers the callback inside of `__init__`, and there is a separate function I can call to trigger the callback. I also have a version where the callback is invoked from a fresh pthread that is external to Julia, which only adds to the complexity.
Any help would be appreciated. Thanks!
I do not have a clear picture of what you are doing exactly. To me it sounds more like a GC issue than the question of what can be part of a callback.
One can probably not read the corresponding documentation often enough. Especially take note of the fact that `cconvert` won’t protect you, as the registering of a callback is expected to return and the problem typically happens after this return.
It means that when you do this in a local scope (e.g. inside a method), every local object you pass to the C library when registering the callback which is not used
- afterwards in this method
- on the call stack
  - the `ccall` of registering the callback does not count as it returns
  - be aware that after registering the callback the call stack might be popped several times before you actually call the code which will eventually call into your callback
- in global scope

is free to be garbage collected (and in my experience will be garbage collected more often than not, and then quite reliably). This is especially true for any lambda or C function pointer which you expected to be called back from the C library.
The behavior in the REPL is often misleading as so much happens in global scope and therefore the corresponding objects are protected there and you’ll only observe your bug when run outside the REPL.
The recommended solution is to store the corresponding objects in global scope (inside of some `const` mutable type). However, I have had good experiences with building Julia data types around whatever you logically want to do with your C library and adding the C library management data (like a function pointer) as additional fields to these data structures next to the “Julia fields”.
This way you can create multiple objects of your data type, resulting in multiple callbacks, without needing to manually handle your global data structure. However, you obviously need to make sure you free your resources correctly. `do` blocks are especially valuable for this in my experience. `finalizer`s sound great at the beginning, but aren’t so much in my experience.
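A minimal sketch of the struct-based approach described above, under stated assumptions: `Counter`, `Subscription`, `subscribe`, `fire` and `with_subscription` are hypothetical names, and `fire` only stands in for the C library calling back (it invokes the function pointer directly with `ccall`). The point is just that the state the C side holds a raw pointer to lives in a field of a Julia object that stays reachable, and that a `do` block scopes its lifetime.

```julia
# Hypothetical state object that the C side would hold a raw pointer to.
mutable struct Counter
    hits::Int
end

# Callback with the C signature `void (*)(void*)`.
function on_event(userdata::Ptr{Cvoid})
    counter = unsafe_pointer_to_objref(userdata)::Counter
    counter.hits += 1
    return nothing
end

# Keeps the Julia-side state next to the C-side handles, so the state is
# reachable for as long as the Subscription is.
mutable struct Subscription
    state::Counter
    fptr::Ptr{Cvoid}
    userdata::Ptr{Cvoid}
    active::Bool
end

function subscribe()
    state = Counter(0)
    fptr = @cfunction(on_event, Cvoid, (Ptr{Cvoid},))
    return Subscription(state, fptr, pointer_from_objref(state), true)
end

# Stand-in for the C library invoking the registered callback.
function fire(s::Subscription)
    fptr, userdata = s.fptr, s.userdata
    GC.@preserve s ccall(fptr, Cvoid, (Ptr{Cvoid},), userdata)
    return nothing
end

# `do`-block friendly helper; a real wrapper would also unregister the
# callback from the C library in the `finally` branch.
function with_subscription(f)
    s = subscribe()
    try
        return f(s)
    finally
        s.active = false
    end
end

hits = with_subscription() do s
    fire(s)
    fire(s)
    s.state.hits
end
```

Here `hits` counts how often the callback ran while the subscription was alive; with a real library, the `ccall` in `fire` would be replaced by whatever library function eventually triggers the callback.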
Can you post your code or a MWE?
Recent versions of Julia have been lifting a lot of those old limitations.
I am using a const global Ref to store the handle to the c-function so I think I should avoid most GC issues. I’ll look into a pattern with the structs as well so lifetimes are more explicitly managed.
I do agree that the REPL version is probably misleading, so I have copied the relevant portions of the code below for the module version, as that is where I ultimately want to get things working anyway. Besides the segfault, I cannot use `pthread_join` in my `invoke_callback_threaded`, as that thread just never seems to join even if the callback is called.
Yeah, I saw some threads mentioning this, and also the PR for external threads and your library ForeignCallbacks. I could not figure out what the “recommended” pattern for Julia >= 1.9 is, though. It’s unclear if the AsyncCondition stuff is still needed or if those docs are just not updated.
C-Code
```c
#include <pthread.h>
#include <stdio.h>

static struct {
    int initialized;
    int counter;
    void* callback_ptr;
} lib_state = {0, 0, NULL};

/* Store the callback pointer handed over by the caller. */
int callback_register(void* func_ptr) {
    lib_state.callback_ptr = func_ptr;
    lib_state.initialized = 1;
    return 0;
}

/* Thread entry point: call func_ptr as void (*)(void*) with a NULL argument. */
void* invoke_callback(void* func_ptr) {
    ((void (*)(void*))func_ptr)(NULL);
    return NULL;
}

/* Invoke the callback from a fresh pthread that is external to Julia. */
int invoke_callback_threaded(void* func_ptr) {
    pthread_t thread;
    if (pthread_create(&thread, NULL, invoke_callback, func_ptr) != 0) {
        perror("pthread_create failed");
        return -1;
    }
    // pthread_join(thread, NULL);
    pthread_detach(thread);
    return 0;
}
```
Julia code
```julia
const lib_path = joinpath(@__DIR__,"../../build/libcallback.so")
const callback_fn = Ref{Ptr{Cvoid}}()

function register_callback()
    result = @ccall lib_path.callback_register(callback_fn::Ptr{Cvoid})::Cint
    if result != 0
        error("Failed to register callback")
    end
end

function my_callback()
    GC.gc(false)
    println("in here")
    return nothing
end

function __init__()
    precompiling = ccall(:jl_generating_output, Cint, ()) != 0
    if !precompiling
        callback_fn[] = @cfunction(my_callback, Nothing, (Ptr{Cvoid},))
        register_callback()
    end
end
```
Edit: The functions I have to trigger the callback are much more involved. Can send if needed.
The `@cfunction` call is saying that `my_callback` takes one argument, a `Ptr{Cvoid}`, but your `my_callback` function takes no arguments.
Ah, good catch. I also forgot to put the `[]` when registering my callback. I do not get a segfault anymore!!
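The two fixes discussed above can be sketched as follows. This is a standalone sketch, not the module code: `calls` and `invoke_directly` are hypothetical additions so that it runs without the C library, and the actual registration call is only shown as a comment.

```julia
const callback_fn = Ref{Ptr{Cvoid}}(C_NULL)
const calls = Ref(0)   # hypothetical counter, only here to observe the call

# Fix 1: the method now accepts the Ptr{Cvoid} argument that the
# @cfunction signature below promises to the C side.
function my_callback(::Ptr{Cvoid})
    calls[] += 1
    println("in here")
    return nothing
end

callback_fn[] = @cfunction(my_callback, Cvoid, (Ptr{Cvoid},))

# Fix 2: pass the stored pointer (`callback_fn[]`), not the Ref itself.
# With the library loaded, registration would look like:
#   @ccall lib_path.callback_register(callback_fn[]::Ptr{Cvoid})::Cint

# Stand-in for the C library: call the pointer directly from Julia.
invoke_directly(fptr::Ptr{Cvoid}) = ccall(fptr, Cvoid, (Ptr{Cvoid},), C_NULL)
invoke_directly(callback_fn[])
```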
The pthread on the C side still blocks and never re-joins, even though it appears that the callback has completed successfully (at least based on print statements). Detaching the pthread “works” but feels incorrect, and I am hoping to figure out why I cannot join the pthread. Is there something about that thread calling into Julia that affects its ability to re-join the main process?
It might be some weirdness having to do with thread adoption and garbage collection? See Extreme Multi-Threading: C++ and Julia 1.9 Integration
Thanks, this is a great resource! The C library I want to wrap is unlikely to add calls to the Julia C API internally, so I might need a different pattern if I want things to be safe, as I’ll never be able to manually call things like `jl_adopt_thread` in my C code. Although the adoption should be automatic since I am entering Julia through `@cfunction`.
Edit: I’m pretty sure it’s because Julia adopts the thread when the callback is entered, so the join will never actually complete until the Julia runtime ends.
It’s not necessary anymore. Native callbacks can now `put!` into a `Channel` which the Julia side can `take!` from, so you don’t need ForeignCallbacks.jl anymore. I believe that `Channel` must be buffered so the `put!` doesn’t try to yield, but I’m not 100% sure on that.
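A rough sketch of the `put!`/`take!` pattern as described in this reply, assuming a buffered `Channel` is what is needed. The names `events`, `notify_callback` and `trigger` are hypothetical, and the callback is invoked from the calling Julia thread via `ccall` as a stand-in for the C library, so this does not by itself demonstrate behavior on a foreign thread.

```julia
# Buffered channel, so that `put!` returns without blocking while there is
# free capacity.
const events = Channel{Int}(16)

# Callback with the C signature `void (*)(void*)`; it only forwards a
# value and leaves the real work to the Julia side.
function notify_callback(::Ptr{Cvoid})
    put!(events, 42)
    return nothing
end

fptr = @cfunction(notify_callback, Cvoid, (Ptr{Cvoid},))

# Stand-in for the C library invoking the callback.
trigger(p::Ptr{Cvoid}) = ccall(p, Cvoid, (Ptr{Cvoid},), C_NULL)
trigger(fptr)

# Julia side: consume the event whenever it is convenient.
received = take!(events)
```

In a real wrapper the `take!` would typically live in a long-running task that loops over the channel, so the callback itself stays short.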
@Octogonapus are you aware of any examples of this pattern? Most code bases I’ve seen this in still have the AsyncCondition (e.g CUDA.jl).
I’m pretty sure I cannot use the cfunction approach as I do not want Julia to adopt the thread so I need this other mechanism which is just pure communication.
This `put!` and `take!` pattern is nice, but I still have the issue that ultimately, somewhere in C++ land, a `@cfunction` is called to trigger the callback, which causes Julia to automatically adopt that thread. Is that really not avoidable? I’d like my C code to keep on executing independently of the Julia runtime, because in practice I can’t actually go into the C code and spawn a new thread that Julia is allowed to steal.
It’s also unclear to me what the implications of Julia “adopting” a thread are. At a minimum I’d like to avoid the overhead of setting up a stack/heap for that thread, but more importantly I want to know the effects on whatever C code is downstream from the callback on that thread.
Julia adopts a foreign thread when entering into code on that thread via `cfunction` or a `@ccallable` entry point. Thread adoption includes creating a thread ID, heap, stack limits, and root task. Thread adoption waits while the GC is running.
If you want your C thread to not be adopted, then don’t use `cfunction` or `@ccallable`. Additionally, you will need to use a pattern like in vchuravy/ForeignCallbacks.jl, because you won’t be able to interact with the runtime from within your native function.
All that said, I don’t quite understand why thread adoption is a barrier to your work. Julia doesn’t do much with the thread; it just sets up the bare minimum to get the runtime working. AFAIK other Julia tasks can’t be scheduled on adopted threads but @vchuravy please correct me if I’m wrong.
I am OK with the thread being adopted if there are no side effects on downstream C code after that thread is adopted. It’s just not clear to me what “adoption” means beyond the overhead of the stack, heap, etc.
Like two things I’d like to know:
- Is it a safe pattern to create a new pthread to invoke the callback and then `pthread_detach` and let Julia do whatever it wants with the thread after invoking the callback?
- Can the C-library still use that thread without any problems or will Julia now also start scheduling things into that thread at the same time as the C-library. Like are there effectively two owners now? From what I’ve tested C-code can still be run on that thread I just want to be aware of other costs I might be incurring.