Perhaps naively, I assumed that segfaults only occur when I access memory I don’t own. So in Julia, with bounds checking turned on, they simply shouldn’t occur?
I’ve now run into a case where the opposite happens, and I don’t know where to start debugging. I can’t even judge whether this is a problem in my code, in a library I’m using (in which case likely Zygote), or in Julia itself. More on my code below (I apologize that I was unable to produce an MWE), but here are the observations for now, all with Julia 1.7.1, on an M1, compiled from source. (It does not appear to occur with downloaded versions of 1.6.4, 1.7.0 and 1.7.1 running under Rosetta, but it does occur with a downloaded version of Julia 1.7.1 running natively on the M1.)
The script tests some rrule implementations by constructing some simple nonlinear models, differentiating them with Zygote (using the custom rrule implementations), and then checking the derivatives against finite differences. The following all run fine, with all tests passing and no segfaults:
julia --project=.. test_admodel.jl
julia --project=.. --check-bounds=yes test_admodel.jl
julia -O3 --project=.. test_admodel.jl
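To give a flavour of what such a test does, here is a minimal, dependency-free sketch of a finite-difference check. It is not the actual ACE.jl test code: `model`, `model_grad` and `fdtest` are hypothetical names made up for illustration, and the hand-written gradient merely stands in for what Zygote would return via a custom rrule.

```julia
using LinearAlgebra: dot

# Hypothetical toy model (not taken from ACE.jl): f(x) = sum_i x_i * sin(x_i)
model(x) = sum(sin.(x) .* x)

# Hand-written gradient, standing in for the gradient obtained from
# Zygote through a custom rrule in the real tests
model_grad(x) = sin.(x) .+ x .* cos.(x)

# Compare the directional derivative dot(g(x), u) with central finite
# differences along a random direction u, for a range of step sizes h,
# and return the smallest error observed.
function fdtest(f, g, x; hs = 10.0 .^ (-2:-1:-6))
    u = randn(length(x))
    exact = dot(g(x), u)
    errs = [abs((f(x + h * u) - f(x - h * u)) / (2h) - exact) for h in hs]
    return minimum(errs)
end

x = randn(5)
err = fdtest(model, model_grad, x)
println("smallest finite-difference error: ", err)
```

If the gradient is correct, the error should shrink as h decreases until round-off takes over, so a small minimum over the step sizes is taken as a pass.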
But if I turn on bounds checking and -O3 together, I get a segfault in one of the tests:
julia -O3 --project=.. --check-bounds=yes test_admodel.jl
NOTE: After the last PR, this has actually changed, but I can still reproduce the same behaviour if test_admodel.jl is replaced with runtests.jl. So it is not entirely reproducible, but I guess this is not unusual for segfaults; small changes to the code can change the behaviour?
I would be very grateful for any general thoughts on how segfaults might occur when bounds checking is turned on, or for pointers on where best to file this as a bug report.
Full segfault report:
signal (11): Segmentation fault: 11
in expression starting at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:97
ntuple at ./ntuple.jl:0
unknown function (ip: 0x10e05e667)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
getindex at ./range.jl:373
jfptr_getindex_30379 at /Users/ortner/gits/julia17/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
#s106#1265 at ./compiler/interface2.jl:13 [inlined]
#s106#1265 at ./none:0
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
GeneratedFunctionStub at ./boot.jl:580
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
jl_apply at /Users/ortner/gits/julia17/src/./julia.h:1788 [inlined]
jl_call_staged at /Users/ortner/gits/julia17/src/method.c:431
jl_code_for_staged at /Users/ortner/gits/julia17/src/method.c:482
get_staged at ./compiler/utilities.jl:111
retrieve_code_info at ./compiler/utilities.jl:123 [inlined]
InferenceState at ./compiler/inferencestate.jl:234
typeinf_edge at ./compiler/typeinfer.jl:814 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1342
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1918
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1342
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1918
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1342
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1918
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1342
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_apply at ./compiler/abstractinterpretation.jl:987
abstract_call_known at ./compiler/abstractinterpretation.jl:1249
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1918
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call at ./compiler/abstractinterpretation.jl:1395
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1918
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_edge at ./compiler/typeinfer.jl:823 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:504
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:105
abstract_call_known at ./compiler/abstractinterpretation.jl:1342
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_apply at ./compiler/abstractinterpretation.jl:987
abstract_call_known at ./compiler/abstractinterpretation.jl:1249
abstract_call at ./compiler/abstractinterpretation.jl:1397
abstract_call at ./compiler/abstractinterpretation.jl:1382
abstract_eval_statement at ./compiler/abstractinterpretation.jl:1534
typeinf_local at ./compiler/abstractinterpretation.jl:1900
typeinf_nocycle at ./compiler/abstractinterpretation.jl:2014
_typeinf at ./compiler/typeinfer.jl:226
typeinf at ./compiler/typeinfer.jl:209
typeinf_ext at ./compiler/typeinfer.jl:909
typeinf_ext_toplevel at ./compiler/typeinfer.jl:942
typeinf_ext_toplevel at ./compiler/typeinfer.jl:938
jfptr_typeinf_ext_toplevel_15657 at /Users/ortner/gits/julia17/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
jl_apply at /Users/ortner/gits/julia17/src/./julia.h:1788 [inlined]
jl_type_infer at /Users/ortner/gits/julia17/src/gf.c:295
jl_generate_fptr at /Users/ortner/gits/julia17/src/jitlayers.cpp:338
jl_compile_method_internal at /Users/ortner/gits/julia17/src/gf.c:1980
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:2239 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
adjoint at /Users/ortner/.julia/packages/Zygote/rv6db/src/lib/array.jl:211 [inlined]
_pullback at /Users/ortner/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:65 [inlined]
_pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/lib/broadcast.jl:193 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/lib/lib.jl:203 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at ./broadcast.jl:1303 [inlined]
_pullback at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:40 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:87 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:88 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:41 [inlined]
_pullback at ./compiler/interface2.jl:0
unknown function (ip: 0x10e05ad23)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
_pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:76 [inlined]
_pullback at ./compiler/interface2.jl:0
_pullback at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:88 [inlined]
_pullback at ./compiler/interface2.jl:0
unknown function (ip: 0x10dfdd707)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
_pullback at /Users/ortner/gits/ACE.jl/test/test_admodel.jl:91 [inlined]
_pullback at ./compiler/interface2.jl:0
unknown function (ip: 0x10dfcda8b)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
_pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:34
pullback at /Users/ortner/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:40
gradient at /Users/ortner/.julia/packages/Zygote/rv6db/src/compiler/interface.jl:75
unknown function (ip: 0x10df5d94f)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
jl_apply at /Users/ortner/gits/julia17/src/./julia.h:1788 [inlined]
do_call at /Users/ortner/gits/julia17/src/interpreter.c:126
eval_body at /Users/ortner/gits/julia17/src/interpreter.c:0
jl_interpret_toplevel_thunk at /Users/ortner/gits/julia17/src/interpreter.c:731
jl_toplevel_eval_flex at /Users/ortner/gits/julia17/src/toplevel.c:885
jl_toplevel_eval_flex at /Users/ortner/gits/julia17/src/toplevel.c:830
jl_toplevel_eval at /Users/ortner/gits/julia17/src/toplevel.c:894 [inlined]
jl_toplevel_eval_in at /Users/ortner/gits/julia17/src/toplevel.c:944
eval at ./boot.jl:373 [inlined]
include_string at ./loading.jl:1196
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
_include at ./loading.jl:1253
include at ./Base.jl:418
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
exec_options at ./client.jl:292
_start at ./client.jl:495
jl_sysimg_fvars_base at /Users/ortner/gits/julia17/usr/lib/julia/sys.dylib (unknown line)
_jl_invoke at /Users/ortner/gits/julia17/src/gf.c:0 [inlined]
jl_apply_generic at /Users/ortner/gits/julia17/src/gf.c:2429
jl_apply at /Users/ortner/gits/julia17/src/./julia.h:1788 [inlined]
true_main at /Users/ortner/gits/julia17/src/jlapi.c:559
jl_repl_entrypoint at /Users/ortner/gits/julia17/src/jlapi.c:701
Allocations: 128005306 (Pool: 127966720; Big: 38586); GC: 151
Just in case anybody is actually interested in reproducing this problem, here are the instructions:
Set up and run the tests:
Note that these instructions are more complicated than necessary, but they avoid having to install a custom registry.
git clone git@github.com:ACEsuit/ACEbase.jl.git
cd ACEbase.jl
git checkout 3127b688f41cdd14247f8db5c8e4f531e31dc761
cd ..
git clone git@github.com:ACEsuit/ACE.jl.git
cd ACE.jl
git checkout 35726ccdc0602fb11e9f1b2efd7c6c8ff31a11ce
julia --project=. -e 'using Pkg; Pkg.develop(path = "../ACEbase.jl")'
cd test
julia -O3 --project=.. --check-bounds=yes test_admodel.jl
Or replace the last line with
julia -O3 --project=.. --check-bounds=yes runtests.jl
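Since the behaviour depends on the combination of flags and on whether Julia runs natively or under Rosetta, it may help to print what the session actually picked up when reproducing. A minimal sketch, assuming the internal `Base.JLOptions()` struct (not documented public API, so its fields could change between Julia versions):

```julia
# Base.JLOptions() is an internal struct holding the parsed command-line
# options; treat the field names and values below as an assumption.
opts = Base.JLOptions()
check_bounds = Int(opts.check_bounds)  # assumed meaning: 0 = default, 1 = yes, 2 = no
opt_level = Int(opts.opt_level)        # the level passed via -O

println("Julia ", VERSION, " on ", Sys.ARCH)
println("check_bounds = ", check_bounds)
println("opt_level    = ", opt_level)
```

Placed at the top of test_admodel.jl or runtests.jl, this would show for each run whether bounds checking and -O3 were both in effect, and Sys.ARCH should distinguish a native M1 build from an x86_64 build running under Rosetta.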