Understanding runtime errors with PackageCompiler-built executables

I tried to build an application using PackageCompiler.build_executable. I can run the resulting executable with its --help option, but as soon as it has to process actual data, it fails with “fatal: error thrown and no exception handler available.”:

fatal: error thrown and no exception handler available.
#<null>
rec_backtrace at /home/bli/src/julia/src/stackwalk.c:94
record_backtrace at /home/bli/src/julia/src/task.c:219 [inlined]
jl_throw at /home/bli/src/julia/src/task.c:429
check_channel_state at ./channels.jl:117 [inlined]
take_unbuffered at ./channels.jl:366
take! at ./channels.jl:344 [inlined]
iterate at ./channels.jl:410
iterate at ./channels.jl:409 [inlined]
#3 at /home/bli/src/qaf_demux/Julia/QafDemux/bin/qaf_demux.jl:104
make_record_writers at /home/bli/src/qaf_demux/Julia/QafDemux/src/QafDemux.jl:258
jl_apply_generic at /home/bli/src/julia/src/gf.c:2197
julia_main at /home/bli/src/qaf_demux/Julia/QafDemux/bin/qaf_demux.jl:101
julia_main at /home/bli/src/qaf_demux/Julia/QafDemux/deps/builddir/qaf_demux.so (unknown line)
main at ./deps/builddir/qaf_demux (unknown line)
__libc_start_main at /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
_start at ./deps/builddir/qaf_demux (unknown line)

This happens on the same computer where the executable was built, with the executable still located in its deps/builddir alongside the necessary .so files.

When running the .jl file directly (actually, a version with a normal julia_main function instead of a Base.@ccallable function julia_main(ARGS::Vector{String})::Cint), there is no such failure.

Could someone help me understand what’s wrong? It’s my first Julia program, so I may have missed an obvious hint in the error message.

Well, actually, the plain non-compiled version only worked once the version of one of the dependencies was correctly set. I don’t know whether this is related, but after fixing this, the error when running the compiled version changed, and it seems to give a little more information about the cause of the failure:

fatal: error thrown and no exception handler available.
MethodError(f=typeof(Base.convert)(), args=(Int32, nothing), world=0x00000000000066d2)
rec_backtrace at /home/bli/src/julia/src/stackwalk.c:94
record_backtrace at /home/bli/src/julia/src/task.c:219 [inlined]
jl_throw at /home/bli/src/julia/src/task.c:429
jl_method_error_bare at /home/bli/src/julia/src/gf.c:1606
jl_method_error at /home/bli/src/julia/src/gf.c:1624
jl_apply_generic at /home/bli/src/julia/src/gf.c:2161
julia_main at /home/bli/src/qaf_demux/Julia/QafDemux/bin/qaf_demux_to_compile.jl:102
julia_main at /home/bli/src/qaf_demux/Julia/QafDemux/deps/builddir/qaf_demux.so (unknown line)
main at ./deps/builddir/qaf_demux (unknown line)
__libc_start_main at /build/glibc-LK5gWL/glibc-2.23/csu/../csu/libc-start.c:291
_start at ./deps/builddir/qaf_demux (unknown line)

I do use convert explicitly in one function of the main library used by the program:

$ grep convert src/QafDemux.jl
qualnum2prob(qual::Int64) = qualnum2prob(convert(UInt8, qual))

I still don’t understand how to fix the issue, though.

I think you may have to add plenty of println calls to your code base to figure out where it fails. PackageCompiler shouldn’t alter the code path when running julia executable.jl vs ./executable or .\executable.exe, so you should be able to debug this without compiling your code.

If you can share a minimal working example of this error it’ll be easier for people here to help with debugging.

I ended up fixing the issue thanks to a suggestion obtained at stackoverflow: https://stackoverflow.com/q/58105935/1878788

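The error seems to be that my julia_main did not return an integer, contrary to what its declaration claimed: Base.@ccallable function julia_main(ARGS::Vector{String})::Cint. The function implicitly returned nothing, which cannot be converted to Cint (Int32), hence the MethodError on convert.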
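As a minimal sketch of that failure mode: the names process, julia_main_broken and julia_main_fixed below are hypothetical, and the Base.@ccallable prefix of the real entry point is left out so that the snippet runs as a plain script. A return type annotation makes Julia call convert on whatever the function returns, so a body whose last expression evaluates to nothing fails only at the very end.

```julia
# Hypothetical stand-in for the real program logic; println returns `nothing`.
process(args::Vector{String}) = println("processing ", length(args), " argument(s)")

# Broken: the value of the last expression is `nothing`, so the `::Cint`
# annotation ends up calling convert(Int32, nothing), which has no method.
function julia_main_broken(ARGS::Vector{String})::Cint
    process(ARGS)
end

# Fixed: report any error and always return an integer exit code.
function julia_main_fixed(ARGS::Vector{String})::Cint
    try
        process(ARGS)
    catch err
        showerror(stderr, err, catch_backtrace())
        println(stderr)
        return 1
    end
    return 0
end

julia_main_fixed(String[])
```

Wrapping the body in try/catch also avoids the bare “no exception handler available” message, because the error is printed before the function returns a non-zero code.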

And this is indeed in a piece of code not present in the “pure julia” version.

Well, it used to work 5 days ago, but today the compilation fails, even if I go back to the commit I made just after it had worked:

┌ Error: Error building `QafDemux`: 
│ [ Info: Smallest distance between barcodes: 3
│ ┌ Info: Fastq files have been written:
│ │   outfile_paths_dict =
│ │    Dict{String,String} with 13 entries:
│ │      "GCAGAGAGGAAT" => "test_run/GCAGAGAGGAAT.fastq.gz"
│ │      "GCAGAGAGAGAC" => "test_run/GCAGAGAGAGAC.fastq.gz"
│ │      "GCAGAGATGTTG" => "test_run/GCAGAGATGTTG.fastq.gz"
│ │      "GCAGAGACCAAC" => "test_run/GCAGAGACCAAC.fastq.gz"
│ │      "Undetermined" => "test_run/Undetermined.fastq.gz"
│ │      "GCAGAGAGGCTA" => "test_run/GCAGAGAGGCTA.fastq.gz"
│ │      "GCAGAGACAACT" => "test_run/GCAGAGACAACT.fastq.gz"
│ └      ⋮              => ⋮
│ ┌ Warning: Snoop file errored. Precompile statements were recorded untill error!
│ │   exception =
│ │    LoadError: MethodError: no method matching make_record_reader(::SubString{String}, ::Array{String,1}, ::getfield(QafDemux, Symbol("#seq_qual_extractor#7")){UnitRange{Int64}}, ::Bool)
│ │    Closest candidates are:
│ │      make_record_reader(!Matched::String, ::Array{String,1}, ::Any, ::Any) at /home/bli/src/qaf_demux/Julia/QafDemux/src/QafDemux.jl:216
│ │      make_record_reader(!Matched::String, ::Array{String,1}, ::Any) at /home/bli/src/qaf_demux/Julia/QafDemux/src/QafDemux.jl:216
│ │    in expression starting at /home/bli/src/qaf_demux/Julia/QafDemux/bin/snoop.jl:16
│ └ @ Main ~/.julia/packages/PackageCompiler/CJQcs/sysimg/run_julia_code.jl:7
│ ERROR: Unable to find compatible target in system image.
│ [ Info: used 266 out of 266 precompile statements
│ ERROR: LoadError: failed process: Process(`/home/bli/src/julia/usr/bin/julia --cpu-target=x86_64 --output-o=qaf_demux.a --track-allocation=none --code-coverage=none --inline=yes --math-mode=ieee --startup-file=no --compile=yes --track-allocation=none --sysimage-native-code=yes --sysimage=/home/bli/src/julia/usr/lib/julia/sys.so --compiled-modules=yes --optimize=0 /home/bli/.julia/packages/PackageCompiler/CJQcs/sysimg/run_julia_code.jl`, ProcessExited(1)) [1]
│ 
│ Stacktrace:
│  [1] pipeline_error at ./process.jl:813 [inlined]
│  [2] #run#536(::Bool, ::typeof(run), ::Cmd) at ./process.jl:728
│  [3] run at ./process.jl:726 [inlined]
│  [4] #run_julia#1 at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/compiler_flags.jl:225 [inlined]
│  [5] #run_julia at ./none:0 [inlined]
│  [6] (::getfield(PackageCompiler, Symbol("##13#14")){Base.Iterators.Pairs{Symbol,Any,NTuple{14,Symbol},NamedTuple{(:sysimage, :startup_file, :handle_signals, :sysimage_native_code, :compiled_modules, :depwarn, :warn_overwrite, :compile, :cpu_target, :optimize, :debug_level, :inline, :check_bounds, :math_mode),Tuple{Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,String,Nothing,Nothing,Nothing,Nothing,Nothing}}},String})() at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/static_julia.jl:263
│  [7] cd(::getfield(PackageCompiler, Symbol("##13#14")){Base.Iterators.Pairs{Symbol,Any,NTuple{14,Symbol},NamedTuple{(:sysimage, :startup_file, :handle_signals, :sysimage_native_code, :compiled_modules, :depwarn, :warn_overwrite, :compile, :cpu_target, :optimize, :debug_level, :inline, :check_bounds, :math_mode),Tuple{Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,String,Nothing,Nothing,Nothing,Nothing,Nothing}}},String}, ::String) at ./file.jl:96
│  [8] #build_object#12(::Base.Iterators.Pairs{Symbol,Any,NTuple{14,Symbol},NamedTuple{(:sysimage, :startup_file, :handle_signals, :sysimage_native_code, :compiled_modules, :depwarn, :warn_overwrite, :compile, :cpu_target, :optimize, :debug_level, :inline, :check_bounds, :math_mode),Tuple{Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,Nothing,String,Nothing,Nothing,Nothing,Nothing,Nothing}}}, ::typeof(PackageCompiler.build_object), ::String, ::String, ::String, ::Bool) at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/static_julia.jl:262
│  [9] #build_object at ./none:0 [inlined]
│  [10] build_object(::String, ::String, ::String, ::Bool, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::String, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing) at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/static_julia.jl:241
│  [11] #static_julia#5(::Nothing, ::Bool, ::Bool, ::Nothing, ::String, ::String, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool, ::Nothing, ::Bool, ::Bool, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::String, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::Nothing, ::typeof(static_julia), ::String) at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/static_julia.jl:162
│  [12] #static_julia at ./tuple.jl:0 [inlined]
│  [13] #build_executable#31 at /home/bli/.julia/packages/PackageCompiler/CJQcs/src/api.jl:104 [inlined]
│  [14] #build_executable at ./none:0 [inlined] (repeats 2 times)
│  [15] top-level scope at /home/bli/src/qaf_demux/Julia/QafDemux/deps/build.jl:14
│  [16] include at ./boot.jl:328 [inlined]
│  [17] include_relative(::Module, ::String) at ./loading.jl:1094
│  [18] include(::Module, ::String) at ./Base.jl:31
│  [19] include(::String) at ./client.jl:431
│  [20] top-level scope at none:5
│ in expression starting at /home/bli/src/qaf_demux/Julia/QafDemux/deps/build.jl:14
│ Building qaf_demux
│   Updating registry at `~/.julia/registries/BioJuliaRegistry`
│   Updating git-repo `https://github.com/BioJulia/BioJuliaRegistry.git`
    Updating registry at `~/.julia/registries/General`
│   Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
│   Updating `/tmp/jl_BKXKwA/Project.toml`
│   [3fa0cd96] + REPL 
│   Updating `/tmp/jl_BKXKwA/Manifest.toml`
│   [2a0f44e3] ~ Base64  [`@stdlib/Base64`] ⇒ 
│   [ade2ca70] ~ Dates  [`@stdlib/Dates`] ⇒ 
│   [8bb1440f] ~ DelimitedFiles  [`@stdlib/DelimitedFiles`] ⇒ 
│   [8ba89e20] ~ Distributed  [`@stdlib/Distributed`] ⇒ 
│   [b77e0a4c] ~ InteractiveUtils  [`@stdlib/InteractiveUtils`] ⇒ 
│   [76f85450] ~ LibGit2  [`@stdlib/LibGit2`] ⇒ 
│   [8f399da3] ~ Libdl  [`@stdlib/Libdl`] ⇒ 
│   [37e2e46d] ~ LinearAlgebra  [`@stdlib/LinearAlgebra`] ⇒ 
│   [56ddb016] ~ Logging  [`@stdlib/Logging`] ⇒ 
│   [d6f4376e] ~ Markdown  [`@stdlib/Markdown`] ⇒ 
│   [a63ad114] ~ Mmap  [`@stdlib/Mmap`] ⇒ 
│   [de0858da] ~ Printf  [`@stdlib/Printf`] ⇒ 
│   [3fa0cd96] ~ REPL  [`@stdlib/REPL`] ⇒ 
│   [9a3f8284] ~ Random  [`@stdlib/Random`] ⇒ 
│   [ea8e919c] ~ SHA  [`@stdlib/SHA`] ⇒ 
│   [9e88b42a] ~ Serialization  [`@stdlib/Serialization`] ⇒ 
│   [1a1011a3] ~ SharedArrays  [`@stdlib/SharedArrays`] ⇒ 
│   [6462fe0b] ~ Sockets  [`@stdlib/Sockets`] ⇒ 
│   [2f01184e] ~ SparseArrays  [`@stdlib/SparseArrays`] ⇒ 
│   [10745b16] ~ Statistics  [`@stdlib/Statistics`] ⇒ 
│   [8dfed614] ~ Test  [`@stdlib/Test`] ⇒ 
│   [cf7118a7] ~ UUIDs  [`@stdlib/UUIDs`] ⇒ 
│   [4ec0a83e] ~ Unicode  [`@stdlib/Unicode`] ⇒ 
│  Resolving package versions...
│   Updating `/tmp/jl_BKXKwA/Project.toml`
│  [no changes]
│   Updating `/tmp/jl_BKXKwA/Manifest.toml`
│  [no changes]
│ Julia program file:
│   "/home/bli/src/qaf_demux/Julia/QafDemux/bin/qaf_demux_to_compile.jl"
│ C program file:
│   "/home/bli/.julia/packages/PackageCompiler/CJQcs/examples/program.c"
│ Build directory:
│   "/home/bli/src/qaf_demux/Julia/QafDemux/deps/builddir"
└ @ Pkg.Operations ~/src/julia/usr/share/julia/stdlib/v1.2/Pkg/src/backwards_compatible_isolation.jl:647

My make_record_reader has the following declaration:

function make_record_reader(fq_filename::String, barcodes::Vector{String}, seq_qual_extractor, func_style=true)

And is called as follows:

make_record_reader(fq_filename, barcodes, seq_qual_extractor, true)

Where seq_qual_extractor is generated by the following piece of code:

    if bc_start < 0
        # bc_range has to be computed for each read based on its length
        seq_qual_extractor = make_seq_qual_extractor(bc_start, bc_len)
    else
        bc_range = bc_start:bc_start+(bc_len-1)
        seq_qual_extractor = make_seq_qual_extractor(bc_range)
    end

So it is the result of either one of the following functions:

function make_seq_qual_extractor(bc_range::UnitRange{Int})
    function seq_qual_extractor(record::fq.Record)::Tuple{String,Vector{UInt8}}
        subseq = fq.sequence(String, record, bc_range)
        quals = fq.quality(record, :illumina18, bc_range)
        return (subseq, quals)
    end
end


function make_seq_qual_extractor(bc_start::Int, bc_len::Int)
    @assert bc_start < 0 "*bc_start* should be negative."
    function seq_qual_extractor(record::fq.Record)::Tuple{String,Vector{UInt8}}
        seq = fq.sequence(String, record)
        seq_len = length(seq)
        real_start = seq_len + bc_start + 1
        bc_range = real_start:real_start+(bc_len-1)
        quals = fq.quality(record, :illumina18, bc_range)
        return (seq[bc_range], quals)
    end
end
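Regarding the snoop file warning in the build log: the MethodError reports a first argument of type SubString{String}, while the declaration only accepts String. Where the SubString comes from in snoop.jl is not shown; functions such as split, strip and chomp are common sources. The sketch below uses hypothetical stand-ins (describe_file, describe_any, and made-up file names) to show the mismatch and two possible ways around it.

```julia
# Hypothetical stand-in with the same kind of signature as make_record_reader.
describe_file(fq_filename::String) = "reading " * fq_filename

# split returns SubString{String} elements, not String.
parts = split("reads_1.fastq.gz,reads_2.fastq.gz", ",")
@show typeof(parts[1])

# describe_file(parts[1]) would throw a MethodError,
# so one option is to convert explicitly at the call site.
println(describe_file(String(parts[1])))

# Another option is to accept any string type in the signature.
describe_any(fq_filename::AbstractString) = "reading " * fq_filename
println(describe_any(parts[1]))
```

Widening the signature to AbstractString is usually the less intrusive change, since callers no longer need to know how the file name was produced.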

It turns out that on this machine, I use a locally compiled Julia, and that the failure to compile was due to the use of cpu_target="x86_64":

$ cat deps/build.jl
import Pkg
println("Building qaf_demux")
Pkg.add("REPL")
Pkg.add("PackageCompiler")
#push!(LOAD_PATH, abspath(joinpath(@__DIR__, "../src/")))
using PackageCompiler
# setting cpu_target does not work with a self-built Julia: https://github.com/NHDaly/ApplicationBuilder.jl/issues/62#issuecomment-503721859
# Compile:
build_executable(joinpath(@__DIR__, "../bin/qaf_demux_to_compile.jl"), "qaf_demux", snoopfile=joinpath(@__DIR__, "../bin/snoop.jl"))
# Does not compile:
# build_executable(joinpath(@__DIR__, "../bin/qaf_demux_to_compile.jl"), "qaf_demux", snoopfile=joinpath(@__DIR__, "../bin/snoop.jl"), cpu_target="x86_64")

How come this produces the kind of compile error shown above?

I think I found another possible cause for the first runtime error (the one involving check_channel_state at ./channels.jl:117 [inlined]): I just stumbled upon a situation where I can reproduce it.

The case is as follows: The test data on which I ran the compiled application was not the same as the one on which I ran the plain .jl code.

I ran the plain .jl code during the %post phase of the build of a Singularity container. In that environment, the test data was correctly cloned together with the git repository containing the source code, because I took care of installing git-lfs as part of the container setup.

In contrast, I ran the built executable on my workstation, where Julia is installed, but where the local clone of the git repository did not contain the “real” test data, only files like this:

version https://git-lfs.github.com/spec/v1
oid sha256:31b353b795b7dd2eb59e71e5c6b76c89452b2dc80f5206058da9985bb321f218
size 2134706

These are Git LFS pointer files: they tell git-lfs where to find the real data, but they are not the real data itself.
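This would fit the first backtrace, whose frames (iterate, take!, check_channel_state) show the error surfacing while iterating over a Channel. When the task that feeds a Channel fails, the channel is closed with that error and the consumer sees it on its next take!. The sketch below is only an illustration of that mechanism: make_records and count_records are hypothetical, and the assumption that the real producer failed on unreadable input is not confirmed by the backtrace.

```julia
# Producer: hypothetical stand-in for a reader that fails on unexpected input.
function make_records(lines)
    Channel() do ch
        for line in lines
            startswith(line, "@") || error("not a FASTQ header: " * line)
            put!(ch, line)
        end
    end
end

# Consumer: iterating over the channel surfaces the producer's failure.
function count_records(lines)
    n = 0
    for _ in make_records(lines)
        n += 1
    end
    return n
end

println(count_records(["@read1", "@read2"]))  # well-formed input

try
    count_records(["@read1", "version https://git-lfs.github.com/spec/v1"])
catch err
    # The exact exception type depends on the Julia version.
    println("consumer saw: ", typeof(err))
end
```

In a compiled executable without a try/catch around the consumer, such an error would reach the top level unhandled.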

There are so many ways to get bitten when trying to containerize a compiled Julia application with test data under git-lfs! I still have a long way to go before my workflows are reproducibility-ready.
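One way to fail early in this situation might be to check whether an input file is a Git LFS pointer before processing it. The following is a sketch, not part of the original program: is_lfs_pointer and check_input are hypothetical names, and the check relies only on the first line of the pointer file shown above.

```julia
# First line of every Git LFS pointer file, as seen in the clone above.
const LFS_POINTER_PREFIX = "version https://git-lfs.github.com/spec/v1"

# Return true if the file at `path` starts like a Git LFS pointer.
function is_lfs_pointer(path::AbstractString)
    open(path, "r") do io
        # Read at most as many bytes as the prefix contains.
        header = read(io, sizeof(LFS_POINTER_PREFIX))
        return String(header) == LFS_POINTER_PREFIX
    end
end

# Hypothetical guard to call before opening an input file.
function check_input(path::AbstractString)
    if is_lfs_pointer(path)
        error("$path is a Git LFS pointer, not real data")
    end
    return path
end
```

If the guard triggers, running git lfs pull in the clone (with git-lfs installed) should replace the pointer files with the real data.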