Strange bug in compiled binaries involving SuiteSparse

I’m still trying to find an MWE, but let me describe what I’ve figured out so far.
Our code, compiled on Dec 14th, works well. However, when compiled on Dec 23rd, the binary prints the following error when run:

┌ Error: Error during initialization of module CHOLMOD
│   exception =
│    could not load library "libcholmod"
│    libcholmod.so: cannot open shared object file: No such file or directory
│    Stacktrace:
│     [1] dlopen(s::String, flags::UInt32; throw_error::Bool)
│       @ Base.Libc.Libdl ./libdl.jl:117
│     [2] dlopen (repeats 2 times)
│       @ ./libdl.jl:117 [inlined]
│     [3] __init__()
│       @ SuiteSparse.CHOLMOD ~/opt/hylanemos_1.3.0.0.17/share/julia/stdlib/v1.7/SuiteSparse/src/cholmod.jl:161
└ @ SuiteSparse.CHOLMOD /root/julia-1.7.1/share/julia/stdlib/v1.7/SuiteSparse/src/cholmod.jl:245

Yes, we compiled it with Julia 1.7.1, not the current stable version 1.9.4. There are a few reasons:

  1. The most recent LTS of Julia is in fact 1.6.7, and we want to stick to LTS because it is presumably more stable. However, MKL.jl requires at least 1.7, so we use 1.7.1. We don’t want to change it unless a new LTS is released or some other dependency requires it.
  2. We heard that the memory usage of 1.9 is a bit larger than before.

Back to the main point. Comparing the two binaries, we found that the main difference occurs when PackageCompiler.jl bundles stdlibs. The outputs are the following, respectively:

  ├── Stdlibs:
  │   ├── LibCURL_jll
  │   │   ├── libcurl.so.4.7.0 - 638.750 KiB
  │   ├── LibGit2_jll
  │   │   ├── libgit2.so.1.1.0 - 1.467 MiB
  │   ├── LibSSH2_jll
  │   │   ├── libssh2.so.1.0.1 - 278.742 KiB
  │   ├── MbedTLS_jll
  │   │   ├── libmbedcrypto.so.2.24.0 - 586.603 KiB
  │   │   ├── libmbedtls.so.2.24.0 - 308.078 KiB
  │   │   ├── libmbedx509.so.2.24.0 - 179.938 KiB
  │   ├── OpenBLAS_jll
  │   │   ├── libopenblas64_.0.3.13.so - 30.227 MiB
  │   ├── SuiteSparse_jll
  │   │   ├── libamd.so.2.4.6 - 38.144 KiB
  │   │   ├── libbtf.so.1.2.6 - 12.801 KiB
  │   │   ├── libcamd.so.2.4.6 - 42.451 KiB
  │   │   ├── libccolamd.so.2.9.6 - 46.535 KiB
  │   │   ├── libcholmod.so.3.0.14 - 982.336 KiB
  │   │   ├── libcolamd.so.2.9.6 - 30.518 KiB
  │   │   ├── libklu.so.1.3.8 - 208.845 KiB
  │   │   ├── libldl.so.2.2.6 - 12.979 KiB
  │   │   ├── librbio.so.2.2.6 - 64.918 KiB
  │   │   ├── libspqr.so.2.0.9 - 202.683 KiB
  │   │   ├── libsuitesparseconfig.so.5.10.1 - 10.641 KiB
  │   │   ├── libumfpack.so.5.7.9 - 744.107 KiB
  │   ├── libblastrampoline_jll
  │   │   ├── libblastrampoline.so - 2.849 MiB
  │   ├── nghttp2_jll
  │   │   ├── libnghttp2.so.14.20.0 - 661.260 KiB

and

  ├── Stdlibs:
  │   ├── libblastrampoline_jll
  │   │   ├── libblastrampoline.so - 2.849 MiB
  │   ├── OpenBLAS_jll
  │   │   ├── libopenblas64_.0.3.13.so - 30.227 MiB
  │   ├── LibSSH2_jll
  │   │   ├── libssh2.so.1.0.1 - 278.742 KiB
  │   ├── LibCURL_jll
  │   │   ├── libcurl.so.4.7.0 - 638.750 KiB
  │   ├── MbedTLS_jll
  │   │   ├── libmbedcrypto.so.2.24.0 - 586.603 KiB
  │   │   ├── libmbedtls.so.2.24.0 - 308.078 KiB
  │   │   ├── libmbedx509.so.2.24.0 - 179.938 KiB
  │   ├── nghttp2_jll
  │   │   ├── libnghttp2.so.14.20.0 - 661.260 KiB

In the build from a week later, SuiteSparse_jll has vanished from the list (as has LibGit2_jll), so libcholmod is no longer bundled.
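
Comparing two such listings by eye is error-prone, so the diff can be done mechanically. The following is only a sketch: `bundled_jlls` is a made-up helper name, the two sample strings are abbreviated stand-ins for real PackageCompiler output, and the regex assumes the tree layout shown above.

```julia
# Extract the names of bundled JLL packages from a PackageCompiler tree listing.
# Assumes each package appears on its own line as "── <Name>_jll".
function bundled_jlls(listing::AbstractString)
    names = String[]
    for line in split(listing, '\n')
        m = match(r"──\s+(\w+_jll)\s*$", line)
        m === nothing || push!(names, m.captures[1])
    end
    return sort!(names)
end

# Abbreviated, illustrative listings (not the full output from the two builds).
old_listing = """
  ├── Stdlibs:
  │   ├── OpenBLAS_jll
  │   │   ├── libopenblas64_.0.3.13.so - 30.227 MiB
  │   ├── SuiteSparse_jll
  │   │   ├── libcholmod.so.3.0.14 - 982.336 KiB
"""
new_listing = """
  ├── Stdlibs:
  │   ├── OpenBLAS_jll
  │   │   ├── libopenblas64_.0.3.13.so - 30.227 MiB
"""

# Packages bundled in the old build but missing from the new one.
missing_jlls = setdiff(bundled_jlls(old_listing), bundled_jlls(new_listing))
println(missing_jlls)
```

Pasting the two full listings into `old_listing` and `new_listing` gives the complete set of JLL packages that dropped out between builds.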
The problem is:

  1. We don’t use SuiteSparse in our code, and nothing related to it appears in our Manifest.toml. Why would the compiled binary need it at all, let alone throw an exception?
  2. Even if some magic causes SuiteSparse to be used under the hood, what changed during that week to cause the difference?
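
One way to approach the first question is to ask which packages in the resolved environment declare a dependency on a given name. This is a minimal sketch, assuming the application's project is the active environment; `dependency_graph` and `dependents` are hypothetical helper names. It only inspects the manifest, so it says nothing about stdlibs that end up in the sysimage without being listed there.

```julia
using Pkg

# Map each package name in the active manifest to the names of its dependencies.
function dependency_graph()
    infos = Pkg.dependencies()
    return Dict(info.name => collect(keys(info.dependencies)) for info in values(infos))
end

# Names of the packages in `graph` that list `target` among their dependencies.
dependents(graph, target) = sort!([name for (name, deps) in graph if target in deps])
```

Calling `dependents(dependency_graph(), "SuiteSparse")` with the project activated lists any package that pulls in SuiteSparse; an empty result suggests the reference comes from somewhere other than the manifest.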

For the second problem, we examined the output of PackageCompiler and found that three packages have changed version:

  1. ChainRulesCore 1.18.0 → 1.19.0
  2. Parsers 2.8.0 → 2.8.1
  3. StaticArrays 1.7.0 → 1.8.1

But after examining the changelogs of these three packages, we still cannot find the reason why SuiteSparse_jll is lost. Any ideas?

PS: Our current workaround is to declare SuiteSparse_jll directly as one of our dependencies.
PS2: In Julia 1.9, SparseArrays uses SuiteSparse_jll, so this might not be a problem there, but we have not tested it yet.
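
For anyone wanting to try the workaround from the first postscript, it might look roughly like this. It is a sketch under assumptions, not the exact steps used above: the `Pkg.add` call is left commented out because it modifies the active project and needs registry access, and `is_direct_dep` is a hypothetical helper for confirming the result before rebuilding.

```julia
using Pkg

# Run once inside the application's project environment:
# Pkg.add("SuiteSparse_jll")

# Report whether `name` is declared under [deps] of the active project.
is_direct_dep(name::AbstractString) = haskey(Pkg.project().dependencies, name)

println(is_direct_dep("SuiteSparse_jll"))
```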

Hah, I just ran into the same issue with Julia 1.8.5: libcholmod.so is not bundled with the binary output of PackageCompiler.jl. Perhaps it’s not a Julia thing but a PackageCompiler thing…

Maybe it’s related to this issue? Dropping more libraries · Issue #865 · JuliaLang/PackageCompiler.jl · GitHub

Or this change? only bundle libraries that are needed (#823) · JuliaLang/PackageCompiler.jl@b4045fa · GitHub

After some further inspection, I am starting to think it is a change between Julia 1.7 and Julia 1.8 after all. The .so files are part of the SuiteSparse_jll library dependencies. In Julia 1.7, this library is in turn a direct dependency of SuiteSparse: it appears in julia/stdlib/v1.7/SuiteSparse/gen/Project.toml. From there, I guess it is picked up by PackageCompiler in gather_stdlibs_project (?).

From Julia 1.8 onward, however, something strange is going on: SuiteSparse_jll has disappeared from the Project.toml and Manifest.toml files. Since it’s no longer there, PackageCompiler doesn’t pick it up and doesn’t bundle it with the binary.
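
Whether a given Julia installation still declares this dependency can be checked directly. The snippet below is a rough sketch: `lists_dep` is a made-up helper name, and the location `SuiteSparse/Project.toml` under `Sys.STDLIB` is an assumption that may not hold for every Julia version (the helper returns `false` if the file is absent).

```julia
using TOML

# Return true if the Project.toml at `path` lists `dep` under [deps].
# Returns false when the file does not exist.
function lists_dep(path::AbstractString, dep::AbstractString)
    isfile(path) || return false
    project = TOML.parsefile(path)
    return haskey(get(project, "deps", Dict{String,Any}()), dep)
end

# Assumed location of the stdlib's project file for the running Julia.
stdlib_project = joinpath(Sys.STDLIB, "SuiteSparse", "Project.toml")
println(VERSION, ": ", lists_dep(stdlib_project, "SuiteSparse_jll"))
```

Running this under each Julia version of interest shows where the declared dependency is present and where it is not.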

Update: In Julia 1.9, it changed again.

@kristoffer.carlsson, care to comment?