Hi all,
I was wondering if someone could help us track down a segfault in Julia/PythonCall. It has been hitting us in GitHub Actions for the past half year, and has now started to hurt our library’s user base. It has been disruptive for all tools operating at the Python<->Julia interface, and our PySR/SymbolicRegression team has not been able to figure it out despite our best efforts.
The main JuliaLang issue is Segfault in `jl_object_id__cold` on Julia 1.11 · Issue #58171 · JuliaLang/julia · GitHub. It seems to occur most readily on Python 3.13, and it can occur on both Julia 1.10 and 1.11. The problem boils down to allocating a large number of Julia objects from Python, during which the Julia GC segfaults. A minimal reproducer:
export PYTHON_JULIACALL_HANDLE_SIGNALS=yes
python -c '
from juliacall import Main as jl
x = [jl.randn(5) for _ in range(100000)]'
There are several reasons this has been a pain to track down, including:
- We have not been able to reproduce this locally on Linux, although it does occur, rarely, in CI on Linux. It is most reproducible on macOS, followed by Windows.
- It occurs randomly.
- It seems to occur most readily on low-memory systems such as GitHub Actions runners, presumably because the GC is more active.
- The number of Julia threads does not affect occurrence.
- The stack trace is different each time. According to @vchuravy, `jl_object_id__cold` only indicates that an object is not rooted in the GC, so the stack trace might not be helpful.
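Since the crash is intermittent, a single run says very little. Here is a rough sketch of a harness that reruns a command many times and counts how many runs die from SIGSEGV. The helper name `count_segfaults` and the constants are my own, not part of PythonCall or PySR, and the exit-status check assumes a POSIX system (macOS or Linux), so it would need adapting for Windows.

```python
import os
import signal
import subprocess
import sys

# The reproducer from above, as a string passed to `python -c`.
REPRO = "from juliacall import Main as jl\nx = [jl.randn(5) for _ in range(100000)]"
REPRO_ENV = {"PYTHON_JULIACALL_HANDLE_SIGNALS": "yes"}


def count_segfaults(cmd, runs=20, extra_env=None, timeout=600):
    """Run `cmd` `runs` times and return how many runs ended in SIGSEGV."""
    env = {**os.environ, **(extra_env or {})}
    crashes = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(
                cmd,
                env=env,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # A hang is not a segfault; skip it rather than miscount.
            continue
        # On POSIX, subprocess reports death-by-signal as -signum.
        # 128 + signum (139 for SIGSEGV) is what a wrapping shell reports.
        if proc.returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
            crashes += 1
    return crashes
```

Calling `count_segfaults([sys.executable, "-c", REPRO], runs=50, extra_env=REPRO_ENV)` then gives a crash count per 50 runs, which makes it easier to compare Python versions, Julia versions, or runner types than eyeballing individual CI failures.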
It has been recommended that we build Julia from source in “ASAN mode”. We have not been successful in compiling this on macOS, and there are no pre-built binaries available for it either.
It was also recommended that we build Julia in debug mode and run under rr chaos mode (on Linux). We tried that but it didn’t reproduce the segfault.
When the segfault does occur, the stack trace differs from run to run. Here are the ones I have seen in CI:
Example 1 (Julia 1.10; macos-latest; via `ijl_restore_package_image_from_file -> jl_table_assign_bp`)
jl_object_id__cold at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/builtins.c:455
ijl_object_id_ at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/builtins.c:472 [inlined]
jl_table_assign_bp at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/./iddict.c:47
ijl_idtable_rehash at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/./iddict.c:25 [inlined]
jl_table_assign_bp at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/./iddict.c:101
ijl_eqtable_put at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/./iddict.c:146
jl_as_global_root at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:2361
jl_root_new_gvars at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:2131 [inlined]
jl_restore_system_image_from_stream_ at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:3337
jl_restore_package_image_from_stream at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:3471
jl_restore_incremental_from_buf at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:3522 [inlined]
ijl_restore_package_image_from_file at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-XC9YQX9HH2.0/build/default-honeycrisp-XC9YQX9HH2-0/julialang/julia-release-1-dot-10/src/staticdata.c:3606
_include_from_serialized at ./loading.jl:1117
#= truncated =#
run_mod at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
_PyRun_SimpleStringFlagsWithName at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_RunMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
pymain_main at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_BytesMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Allocations: 1852181 (Pool: 1850203; Big: 1978); GC: 3
/Users/runner/work/_temp/410754f3-e90a-4c10-9411-9fe93f0cbed2.sh: line 3: 1467 Segmentation fault: 11 python -c 'import pysr'
Example 2 (Julia 1.11; macos-latest; via `ijl_compress_ir -> ... -> smallintset_rehash`)
jl_object_id__cold at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/builtins.c:441
smallintset_rehash at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/smallintset.c:218
jl_smallintset_insert at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/smallintset.c:197
jl_idset_put_idx at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/./idset.c:104
jl_as_global_root at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/staticdata.c:2548
jl_encode_as_indexed_root at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/ircode.c:108
jl_encode_value_ at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/ircode.c:444
jl_encode_memory_slice at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/ircode.c:135
jl_encode_value_ at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/ircode.c:403
ijl_compress_ir at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/ircode.c:866
maybe_compress_codeinfo at ./compiler/typeinfer.jl:394
#= truncated =#
PyEval_EvalCode at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
run_eval_code_obj at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
run_mod at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
_PyRun_SimpleStringFlagsWithName at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_RunMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
pymain_main at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_BytesMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Allocations: 2882135 (Pool: 2881962; Big: 173); GC: 4
/Users/runner/work/_temp/e37b27f9-612d-4bd8-864c-753733e545f1.sh: line 3: 1496 Segmentation fault: 11 python -c 'import pysr'
Error: Process completed with exit code 139.
Example 3 (Julia 1.11; macos-latest; via `emit_expr -> ... -> smallintset_rehash`)
jl_object_id__cold at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/builtins.c:441
smallintset_rehash at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/smallintset.c:218
jl_smallintset_insert at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/smallintset.c:197
jl_idset_put_idx at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/./idset.c:104
jl_as_global_root at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/staticdata.c:2548
emit_expr at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/codegen.cpp:6151
emit_intrinsic at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/./intrinsics.cpp:1271 [inlined]
emit_call at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/codegen.cpp:5203
emit_expr at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/codegen.cpp:6201
emit_ssaval_assign at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-honeycrisp-R17H3W25T9.0/build/default-honeycrisp-R17H3W25T9-0/julialang/julia-release-1-dot-11/src/codegen.cpp:5747
#= truncated =#
pymain_run_module at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_RunMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
pymain_main at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Py_BytesMain at /Library/Frameworks/Python.framework/Versions/3.13/Python (unknown line)
Allocations: 2950021 (Pool: 2949848; Big: 173); GC: 4
/Users/runner/work/_temp/0f2d754b-1c4e-411c-a6ee-443612959c5a.sh: line 1: 9318 Segmentation fault: 11 python -m pysr test main,cli,startup
Error: Process completed with exit code 139.
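Because the traces differ each time, it can help to strip the long build paths and look only at which frames they share. Below is a small sketch for doing that. It assumes each frame line has the form `name at path:line` as in the excerpts above, and the helper names `frame_names` and `common_frames` are just illustrative.

```python
import re

# A frame line starts with the function name followed by " at ".
FRAME_RE = re.compile(r"^(\S+) at ")


def frame_names(trace):
    """Return the function names of all frame lines in a trace, in order."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def common_frames(traces):
    """Return frames present in every trace, ordered as in the first trace."""
    if not traces:
        return []
    per_trace = [frame_names(t) for t in traces]
    shared = set(per_trace[0]).intersection(*map(set, per_trace[1:]))
    ordered = []
    for name in per_trace[0]:
        if name in shared and name not in ordered:
            ordered.append(name)
    return ordered
```

Run on the three excerpts above, the only Julia runtime frames shared by all of them should be `jl_object_id__cold` and `jl_as_global_root`, alongside the outer Python entry points. Keep in mind that the traces are truncated, so frames hidden behind `#= truncated =#` are not counted.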
I have also posted this to CPython as an issue here: Random segfaults on Python 3.12.10 during CI testing · Issue #134193 · python/cpython · GitHub. They seemed confident that the issue is not from Python.