Hi, I'm wondering whether this is a known issue. I have one simulation per Task, and I use 72 threads to execute these tasks. When I create 160 tasks with Threads.@spawn, my code throws a segmentation fault from BLAS.
I tried to find out exactly which task causes the segmentation fault, so:
- I manually split the 160 tasks into two batches of 80 tasks each, and I don't get the segmentation fault anymore…
- then I ran with --check-bounds=yes to check whether anything goes out of bounds, but I don't get the segmentation fault anymore either
The entire codebase is quite large, so I haven't been able to extract a small, readable script that reproduces it. Since the error comes from a BLAS function rather than from Julia itself, I can't figure out what is causing it.
Any idea what I can use to trace this error further?
The segmentation fault trace looks like this:
signal (11): Segmentation fault
in expression starting at :1
unknown function (ip: 0x7fa766645941)
zgemv_n_SKYLAKEX at /home/ubuntu/packages/julias/julia-1.5/bin/../lib/julia/libopenblas64_.so (unknown line)
zgemv_64_ at /home/ubuntu/packages/julias/julia-1.5/bin/../lib/julia/libopenblas64_.so (unknown line)
gemv! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/blas.jl:626
gemv! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:470
mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:66 [inlined]
mul! at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/matmul.jl:208 [inlined]
#expv!#27 at /home/ubuntu/.julia/packages/ExponentialUtilities/XXu86/src/krylov_phiv.jl:118
I can always reproduce the segmentation fault by running the entire thing (all 160 tasks on 72 threads) without setting --check-bounds=yes.
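For illustration only, the overall pattern is roughly the sketch below. It is not a reproducer and not the actual code: simulate, the matrix size, and the returned value are hypothetical placeholders. The only parts taken from the real run are the use of Threads.@spawn, the count of 160 tasks, and a complex matrix-vector mul!, which goes through the same BLAS zgemv routine that shows up in the trace.

```julia
using LinearAlgebra

# Hypothetical stand-in for one simulation task (not the real simulation).
# A ComplexF64 matrix-vector product via mul! dispatches to BLAS gemv!,
# i.e. zgemv, the routine that appears in the trace.
function simulate(n::Int)
    A = rand(ComplexF64, n, n)
    x = rand(ComplexF64, n)
    y = similar(x)
    mul!(y, A, x)
    return sum(abs2, y)
end

ntasks = 160  # same task count as in the failing run

# One Task per simulation; Julia schedules them on the available threads.
tasks = [Threads.@spawn simulate(64) for _ in 1:ntasks]

# Wait for all tasks and collect their results.
results = fetch.(tasks)
```

The number of threads is fixed when Julia starts, for example by setting the JULIA_NUM_THREADS environment variable to 72 before launching.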