Unable to find compatible target in system image

I have a multi-stage, CentOS 7-based Dockerfile with Julia 1.5.3 installed to /root/julia in the base image. PackageCompiler is at v1.2.4.

Here is the relevant part:

#################### cleaned_repo #####################

# This strips everything except the .toml files (and bake_sysimage.jl) out of `cleaned_repo`, so that changes unrelated to dependencies do not invalidate the cache of later stages (this stage itself always runs).

FROM base AS cleaned_repo
LABEL description="Filter to only .toml files for building the sysimage"

RUN mkdir -p /root/myproj.jl

WORKDIR /root/myproj.jl

COPY . .

RUN find -mindepth 1 -maxdepth 1 -type f -not -name "*.toml" -exec rm -rf {} \;

RUN find -mindepth 2 -maxdepth 2 -not -name "*.toml" -not -name "bake_sysimage.jl" -exec rm -rf {} \;

###################### sysimage #######################

# This copies that skeletal repository into `sysimage` and builds the sysimage using the set of remote dependencies involved.
# If the remaining files have not changed (no dependency changes) then this stage will hit the cache and save ~35 minutes of build time.

FROM base AS sysimage
LABEL description="Builds sysimage from filtered repo"

RUN mkdir -p /root/myproj.jl

WORKDIR /root/myproj.jl

COPY --from=cleaned_repo /root/myproj.jl/. .

RUN source scl_source enable devtoolset-9 && julia scripts/bake_sysimage.jl

RUN rm -rf /root/myproj.jl

###################### prod_julia #####################

# Following that, it compiles the local code into an app using `incremental=true`

FROM sysimage AS prod_julia
LABEL description="Compiles julia code"

RUN mkdir -p /root/myproj.jl

WORKDIR /root/myproj.jl

COPY . .

RUN sh /root/scl_enable.sh julia scripts/compile_app.jl

The intent here is to avoid recompiling the dependencies (sysimage stage, >30 minutes) for every code change in MyProj.
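The filtering done by the two `find` commands in the cleaned_repo stage can be tried outside Docker. The following is a minimal sketch on a throwaway directory; apart from `inner.jl`, `bake_sysimage.jl` and `compile_app.jl`, which appear above, the file names are hypothetical, and an explicit `.` is passed to `find` so the commands also run with non-GNU `find`.

```shell
# Build a scratch tree that mimics the project layout (file names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/scripts" "$demo/inner.jl/src"
touch "$demo/Project.toml" "$demo/Manifest.toml" "$demo/README.md"
touch "$demo/src/MyProj.jl"
touch "$demo/scripts/bake_sysimage.jl" "$demo/scripts/compile_app.jl"
touch "$demo/inner.jl/Project.toml" "$demo/inner.jl/src/Inner.jl"

(
    cd "$demo" || exit 1
    # Top level: delete every regular file that is not a .toml file.
    find . -mindepth 1 -maxdepth 1 -type f -not -name "*.toml" -exec rm -rf {} \;
    # One level down: delete everything except .toml files and bake_sysimage.jl.
    find . -mindepth 2 -maxdepth 2 -not -name "*.toml" -not -name "bake_sysimage.jl" -exec rm -rf {} \;
    # Show what is left; only these files feed the sysimage stage.
    find . -type f | sort
)
```

Only the .toml files and `scripts/bake_sysimage.jl` should remain, so editing any other file leaves the input of the sysimage stage unchanged and lets Docker reuse its cached layer.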

Unfortunately, it fails on the prod_julia stage with the error:

#23 347.7 [ Info: PackageCompiler: creating system image object file, this might take a while...
#23 347.7 ERROR: Unable to find compatible target in system image.
#23 348.8 ERROR: LoadError: failed process: Process(`/root/julia/bin/julia --color=yes --startup-file=no '--cpu-target=generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)' --sysimage=/root/julia/lib/julia/sys.so --project=/root/myproj.jl/inner.jl --output-o=/tmp/jl_UGJKKB.o -e '...'`, ProcessSignaled(11))

I can replicate this locally using the sysimage stage even when swapping the '...' code for 'println("hi")'. However, I would expect the CPU target to be the same for both stages considering I don’t provide it and the builds happen on the same machine.

If I switch the code to 'println("hi")' and also remove the --cpu-target argument, it segfaults (signal 11). If I also remove the --output-o argument, it prints “hi” as intended. Removing the --output-o argument while leaving the --cpu-target argument causes the original error again.

This is the code that generates the original error

Maybe setting cpu_target = PackageCompiler.default_app_cpu_target() helps?

https://github.com/JuliaLang/PackageCompiler.jl/issues/441

Same error.

I will try forcing cpu_target="generic,-cx16" for both to see what happens.

The first build stage spun for over an hour (much longer than previously). This does not seem like the right path forward either.

Error for the last one was:

#19 1406. [ Info: PackageCompiler: creating system image object file, this might take a while...
#19 1406. ERROR: Your CPU does not support the CX16 instruction, which is required by this version of Julia!  This is often due to running inside of a virtualized environment.  Please read https://docs.julialang.org/en/v1/devdocs/sysimg/ for more.
#19 1406. ERROR: LoadError: failed process: Process(`/root/julia/bin/julia --color=yes --startup-file=no --cpu-target=generic,-cx16 --sysimage=/root/julia/lib/julia/sys.so --project=/root/tmp --output-o=/tmp/jl_mNds7U.o -e 'Base.reinit_stdio()

In this case I have -cx16 in the code, so I’m a bit flummoxed.

Switching to a fully generic CPU target for my sysimage worked. However, it came with a fairly substantial performance hit, so I am back at this problem, now on Julia 1.5.4 with PackageCompiler 1.2.5.

To refresh the problem:

  • I build my sysimage (inside a Docker container) on one CentOS 7 VM and push the image to the registry
  • I pull the image onto a different CentOS 7 VM and try to run Julia
  • Julia crashes and core dumps with an “Unable to find compatible target in system image” error

It turns out the core dump caused when trying to access Julia with a “bad” sysimage is only 45 bytes. I cannot attach it, but here it is in hex:

1F8B0800B2C961600003EDC1010D000000C2A0F74F6D0E37A00000000000000000008037039ADE1D2700280000

Can someone read this and tell me what the incompatibility is?
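The hex above starts with 1F 8B, which is the gzip magic number, so the 45 bytes look like a gzip stream rather than a raw core image. A minimal sketch for inspecting it from the shell follows; the helper name `hex_to_bytes` is made up for illustration, and the decompression step assumes the hex was transcribed without errors.

```shell
# The 45 bytes from the post, copied verbatim.
hex="1F8B0800B2C961600003EDC1010D000000C2A0F74F6D0E37A00000000000000000008037039ADE1D2700280000"

# gzip header: bytes 1-2 are the magic number, byte 3 is the compression method (08 = deflate).
magic=$(printf '%s' "$hex" | cut -c1-4)
method=$(printf '%s' "$hex" | cut -c5-6)

# gzip trailer: the last 4 bytes hold the uncompressed size (ISIZE), little-endian.
trailer=$(printf '%s' "$hex" | tail -c 8)
b1=$(printf '%s' "$trailer" | cut -c1-2)
b2=$(printf '%s' "$trailer" | cut -c3-4)
b3=$(printf '%s' "$trailer" | cut -c5-6)
b4=$(printf '%s' "$trailer" | cut -c7-8)
isize=$(printf '%d' "0x$b4$b3$b2$b1")

echo "magic=$magic method=$method uncompressed_size=$isize"

# hex_to_bytes (hypothetical helper): turn a hex string into raw bytes via octal escapes.
hex_to_bytes() {
    printf '%s\n' "$1" | fold -w 2 | while read -r byte; do
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# Decompress and count how many of the output bytes are non-zero.
hex_to_bytes "$hex" | gzip -dc 2>/dev/null | tr -d '\000' | wc -c
```

The trailer reports an uncompressed size of 10240 bytes. That is also the size of an empty tar archive at the default blocking factor, so if the last command reports no non-zero bytes, the file probably holds no core image at all and cannot explain the incompatibility.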

Notes from before:

  • The same Julia sysimage works on a dozen other CentOS7 VMs.
  • -cx16 didn’t seem to fix anything previously

Looking into it more deeply, it builds (and works) on a server running a Cascade Lake Xeon and fails on a Broadwell Xeon.
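Cascade Lake supports AVX-512 and Broadwell does not, so one plausible explanation is that the sysimage was compiled for CPU features the Broadwell host lacks. A small sketch for comparing the two hosts is below. It assumes a Linux host with /proc/cpuinfo; the function names and the list of flags are illustrative choices, and the kernel's flag names do not always match LLVM's feature names (for example `rdrand` in /proc/cpuinfo versus `rdrnd` in the --cpu-target string above).

```shell
# has_flag FLAGS NAME: succeed if NAME is a whole word in the space-separated FLAGS.
has_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# host_flags: print the flag list of the first CPU, or nothing if /proc/cpuinfo is unreadable.
host_flags() {
    [ -r /proc/cpuinfo ] || return 0
    grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2
}

flags=$(host_flags)

# Report a few features that differ between CPU generations or appear in the errors above.
for f in cx16 avx2 avx512f xsaveopt rdrand; do
    if has_flag "$flags" "$f"; then
        echo "$f: yes"
    else
        echo "$f: no"
    fi
done
```

Running this on both the build VM and the failing VM and diffing the output shows which features are present on one host but missing on the other.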

I forgot that I opened this issue related to this so I’m going to put my further testing updates into it.