Segfault; misuse of GC?

Hello, I have had issues with using the garbage collector in the past and when I recently tried to use the new 1.1 GC extensions, I noticed that I couldn’t even get rid of a segfault with the old system.

I swear code at least very similar to this has worked in the past:

#include "julia.h"

int main()
{
  jl_init();
  
  jl_value_t *fn = 0;
  jl_value_t *arg1 = 0;
  jl_value_t *arg2 = 0;
  JL_GC_PUSH3(&fn, &arg1, &arg2);
  
  fn = jl_eval_string(".+");
  arg1 = jl_eval_string("3");
  arg2 = jl_eval_string("[1., 3.4, 5]");
  
  jl_call2(fn, arg1, arg2);
  JL_GC_POP();

  jl_atexit_hook(0);
}

(adapted from How to use `GC_PUSH` macros - #2 by GunnarFarneback).

The error is

signal (11): Segmentation fault
in expression starting at no file:0
jl_typemap_assoc_exact at /buildworker/worker/package_linux64/build/src/julia_internal.h:922 [inlined]
jl_lookup_generic_ at /buildworker/worker/package_linux64/build/src/gf.c:2159 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2205
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1571 [inlined]
jl_call2 at /buildworker/worker/package_linux64/build/src/jlapi.c:227
main at ./gcext (unknown line)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
_start at ./gcext (unknown line)
Allocations: 52383 (Pool: 52362; Big: 21); GC: 0

I am compiling with both `-fPIC` and `-DJULIA_ENABLE_THREADING=1`.

If instead of an array I just use a number, everything works fine, which leads me to believe this is a GC issue rather than incorrect compilation.

Thank you!

I'm not in a position to test this right now, but check whether any of the values you get back is NULL.

I’m rather suspicious about the line fn = jl_eval_string(".+");. Are you sure it returns something sensible? Notice that “.+” is not an operator by itself but syntax for invoking the broadcast machinery.

For reference:

julia> Meta.eval(Meta.parse("+"))
+ (generic function with 163 methods)

julia> Meta.eval(Meta.parse(".+"))
ERROR: UndefVarError: .+ not defined
Stacktrace:
 [1] top-level scope
 [2] eval at ./boot.jl:319 [inlined]
 [3] eval(::Symbol) at ./meta.jl:6
 [4] top-level scope at none:0

julia> ccall(:jl_eval_string, Any, (Cstring,), "+")
+ (generic function with 163 methods)

julia> ccall(:jl_eval_string, Any, (Cstring,), ".+")

signal (11): Segmentation fault
in expression starting at no file:0
jl_f_tuple at /home/gunnar/julia1.0/src/builtins.c:677
eval_user_input at /home/gunnar/julia1.0/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:89
macro expansion at /home/gunnar/julia1.0/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:117 [inlined]
#28 at ./task.jl:259
jl_apply_generic at /home/gunnar/julia1.0/src/gf.c:2184
jl_apply at /home/gunnar/julia1.0/src/julia.h:1537 [inlined]
start_task at /home/gunnar/julia1.0/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 1007085 (Pool: 1006879; Big: 206); GC: 1
Segmentation fault (core dumped)

Yes, .+ was originally a function, so this would have worked in an older version (the last working version would be around julia-0.5, I think). As @GunnarFarneback pointed out, .+ is now special syntax rather than a function name, and it is an error when parsed in isolation. So fn == NULL and you get a segfault. You can use jl_exception_occurred to determine whether an exception occurred in a call to the Julia C API, and which one.

So all this is actually a distraction and unrelated to the new GC extensions. (The gc rooting in your example looks fine to me fwiw.)

Ok right. I thought this was fine, because it is usable as a function in Julia.

PS: Now I have to find cases where rooting is actually needed…

It’s pretty easy to find cases where rooting is necessary, though slightly more tricky to demonstrate very clearly what is going on.

The following is a fairly minimal example:

#include "julia.h"

int main()
{
    jl_init();

    jl_value_t *x = NULL;
    // JL_GC_PUSH1(&x); // Uncomment to remove the segfault
    x = jl_eval_string("1.1");
    jl_eval_string("GC.gc()"); // Will collect the value pointed to by `x` if you don't root it.
    // following will segfault (on my machine) if `x` has been collected
    jl_call1(jl_eval_string("println"), x);

    jl_atexit_hook(0);
    return 0;
}

Thanks @c42f! I was just searching for such an example to show how @yuyichao’s suggestion of creating a reference to the variable in an IdDict via a RefValue{Any} solves the problem. This is particularly useful if you want to keep the pointer alive between function calls (as described in the PR https://github.com/JuliaLang/julia/pull/30399).

The following example can also be used to protect x from GC (not very useful in this case, but it shows what to do when you want to protect a value between function calls):

#include "julia.h"

int main()
{
    jl_init();

    jl_value_t* refs = jl_eval_string("refs = IdDict()");
    jl_function_t* setindex = jl_get_function(jl_base_module, "setindex!");
    jl_function_t* delete = jl_get_function(jl_base_module, "delete!");
    jl_datatype_t* reft = (jl_datatype_t*)jl_eval_string("Base.RefValue{Any}");

    jl_value_t *x = NULL;
    x = jl_eval_string("1.1");
    JL_GC_PUSH1(&x);
    jl_value_t* rvar = jl_new_struct(reft, x);
    JL_GC_POP();

    // Here we add a pointer to `x` in `refs`, which protects it from GC.
    jl_call3(setindex, refs, rvar, rvar);

    // This is how we remove the reference and let the variable be freed by
    // the GC. The program segfaults if this line is uncommented.
    // jl_call2(delete, refs, rvar);

    jl_eval_string("GC.gc()");
    jl_call1(jl_eval_string("println"), x);

    jl_atexit_hook(0);

    return 0;
}

Presumably the line jl_call3(setindex, refs, rvar, rvar); could allocate, so you’re missing a root here: you need to root rvar prior to calling setindex!.

Note that it’s safe to root a reference to a null jl_value_t pointer, so it’s common to create a bunch of NULL pointers up front, rooting them all with a call to one of the JL_GC_PUSH* macros before they are assigned. Then assign them successively, safe in the knowledge that no matter the order of assignment all of these “slots” are properly rooted and the code using them can be safely rearranged. In this case you could do that with x and rvar.

Yes, it could allocate, and no, you don’t need to root it. jl_call* does it for you.

Huh, true. Not what I expected from adding to the internals a bit here and there.

Looks like Jeff decided to do it this way back in 2013 “for extra robustness” for external API users (ee87968)