Unexpected recompilation

I will preface this post with the disclaimer that I cannot provide an MWE here, because if I could provide an MWE then I would be able to solve this problem myself.

I am using multiple dispatch and the setup is approximately

function _f(obj::ConcreteType1; k, w, a, r, g, s)
    @info "Hey, doing stuff!"
    # wow, more stuff
    @info "Yet another message!"
end

function f(id, json_str::String, conn::AbstractConnection)
    obj = JSON3.read(json_str, AbstractType) # nifty!
    @info "Doing a different thing!"
    _f(obj; k, w, a, r, g, s) # no globals, no indexing
end

The code prints “Doing a different thing!” but never prints “Hey, doing stuff!”; it hangs in between.

Things that might be of note:

  • The keyword arguments are untyped as in the example.
  • This works for multiple other concrete types, just not ConcreteType1.
  • It hangs in this specific case both on Jenkins and (thankfully) locally in the REPL.
  • I already checked whether there is a function redefinition, which would mean I am not using the function that I think I am, and there is not.
  • CPU sits at 100% for the process.
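For anyone wanting to repeat the redefinition check from the list above, here is a minimal sketch of one way to do it. The types and the single keyword argument are hypothetical stand-ins for the real ones, which are not shown in this post; only `which` and `methods` are real Base functions.

```julia
# Hypothetical stand-ins for the types in the post.
abstract type AbstractType end
struct ConcreteType1 <: AbstractType end
struct ConcreteType2 <: AbstractType end

# One method per concrete type, with an untyped keyword argument as in the post.
_f(obj::ConcreteType1; k = 1) = "method for ConcreteType1"
_f(obj::ConcreteType2; k = 1) = "method for ConcreteType2"

obj = ConcreteType1()

# `which` returns the Method that dispatch selects for these argument types,
# including the file and line where it was defined.
m = which(_f, Tuple{typeof(obj)})
println(m)

# Redefining a method with the same signature replaces it instead of adding
# a new entry, so the method count should match the number of definitions.
println(length(methods(_f)))
```

If the file and line reported by `which` are the ones you expect, the hang is not caused by dispatching to a stale or shadowed definition.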

If I slam control-c, I can get a message like:

WARNING: Force throwing a SIGINT
Internal error: encountered unexpected error in runtime:
InterruptException()
top-level scope at REPL[20]:1
segv_handler at /buildworker/worker/package_linux64/build/src/signals-unix.c:235

which just looks like a signal handler, or:

WARNING: Force throwing a SIGINT
Internal error: encountered unexpected error in runtime:
InterruptException()
top-level scope at REPL[22]:1
mprotect at /lib64/libc.so.6 (unknown line)
jl_safepoint_disable at /buildworker/worker/package_linux64/build/src/safepoint.c:85 [inlined]
jl_safepoint_defer_sigint at /buildworker/worker/package_linux64/build/src/safepoint.c:193
segv_handler at /buildworker/worker/package_linux64/build/src/signals-unix.c:245
_L_unlock_13 at /lib64/libpthread.so.0 (unknown line)
maybe_collect at /buildworker/worker/package_linux64/build/src/julia_threads.h:294 [inlined]
jl_gc_pool_alloc at /buildworker/worker/package_linux64/build/src/gc.c:1194
jl_gc_alloc_ at /buildworker/worker/package_linux64/build/src/julia_internal.h:277 [inlined]
...

which looks to me like a signal handled during compilation (hence the thread title).

After the stack trace prints, if I stop at just the right number of control-c presses, the process continues running normally. Even if I hit it a few too many times, it prints the @info "Hey, doing stuff!" message and a few subsequent info statements, as if the process is finally getting some attention before the kernel enforces the kill. I even put a sleep(5) after @info "Hey, doing stuff!", anticipating that it might make it yield and flush the stdout buffer before hanging on a later command, but it behaved as before and just paused (by my count) 5 seconds between the messages. Something about throwing the SIGINT seems to get it unstuck, albeit only until process death unless I get really lucky with the number of control-c presses.

I am running Julia 1.5.4 with 8 threads on CentOS 7. I am also using a custom sysimage containing all the packages I am not developing.

What is happening here?

This was a manifestation of this problem, so what was actually holding me up was an infinite loop within type inference.
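In hindsight, one way to tell a compile-time hang from a run-time hang is to trigger inference and compilation without executing the function body. The sketch below assumes a hypothetical type `Example` and function `work`; it is not the code from this thread. If `precompile` or `code_typed` itself never returns for a given signature, the time is going into the compiler rather than into the function.

```julia
# Hypothetical stand-in type and function, for illustration only.
struct Example end

function work(x::Example)
    sleep(0.1)   # stands in for the real work; not executed by the calls below
    return 42
end

# `precompile` runs inference and code generation for this signature
# without calling the function. It returns `true` on success.
ok = precompile(work, (Example,))
println(ok)

# `code_typed` runs type inference only and returns the inferred code
# together with the inferred return type.
typed = code_typed(work, (Example,))
println(length(typed))
```

Wrapping either call in `@time` gives a rough idea of how long compilation takes for that signature on its own.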
