AMDGPU.versioninfo() trips an assertion in AMD's code

System and Julia info (output of ./julia-1.10.8/bin/julia -g2 -e 'using InteractiveUtils; versioninfo()'):

Julia Version 1.10.8
Commit 4c16ff44be8 (2025-01-22 10:06 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × AMD Ryzen 3 5300U with Radeon Graphics
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, znver2)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_NUM_PRECOMPILE_TASKS = 4
  JULIA_PKG_PRECOMPILE_AUTO = 0

I decided to try out GPU programming (mainly to see whether my packages can be compiled for the GPU), so I installed AMDGPU.jl on Julia v1.10.8 and ran using AMDGPU. This worked fine.

Since I don’t know whether my GPU is supported, I then ran AMDGPU.versioninfo(); however, this call aborts inside AMD’s C++ code.

Running ./julia-1.10.8/bin/julia -g2 -e 'using AMDGPU; AMDGPU.versioninfo()':

[ Info: AMDGPU versioninfo
julia: /usr/src/debug/hip-runtime/clr-rocm-6.2.4/hipamd/src/hip_code_object.cpp:1152: hip::FatBinaryInfo** hip::StatCO::addFatBinary(const void*, bool): Assertion `err == hipSuccess' failed.

[4955] signal (6.-6): Aborted
in expression starting at none:1
unknown function (ip: 0x760ba7e6b334)
gsignal at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x760ba7df93de)
__assert_fail at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x760b67c50954)
unknown function (ip: 0x760a8f0ec8a8)
unknown function (ip: 0x760ba80005b6)
unknown function (ip: 0x760ba80006ac)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x760ba80074fb)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x760ba8007903)
unknown function (ip: 0x760ba7e64e03)
_dl_catch_exception at /lib64/ld-linux-x86-64.so.2 (unknown line)
unknown function (ip: 0x760ba7ffd678)
unknown function (ip: 0x760ba7e648e2)
dlopen at /usr/lib/libc.so.6 (unknown line)
ijl_load_dynamic_library at /cache/build/tester-amdci4-11/julialang/julia-master/src/dlload.c:365
jl_get_library_ at /cache/build/tester-amdci4-11/julialang/julia-master/src/runtime_ccall.cpp:46 [inlined]
jl_get_library_ at /cache/build/tester-amdci4-11/julialang/julia-master/src/runtime_ccall.cpp:30
ijl_lazy_load_and_lookup at /cache/build/tester-amdci4-11/julialang/julia-master/src/runtime_ccall.cpp:78
macro expansion at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/utils.jl:122 [inlined]
rocblas_get_version_string at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/blas/librocblas.jl:7631 [inlined]
rocblas_get_version_string at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/blas/rocBLAS.jl:24
version at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/blas/rocBLAS.jl:29
_ver at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/utils.jl:5 [inlined]
versioninfo at /home/nsajko/.julia/packages/AMDGPU/0tq5E/src/utils.jl:6
unknown function (ip: 0x760ba1507f52)
_jl_invoke at /cache/build/tester-amdci4-11/julialang/julia-master/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/tester-amdci4-11/julialang/julia-master/src/gf.c:3077
jl_apply at /cache/build/tester-amdci4-11/julialang/julia-master/src/julia.h:1982 [inlined]
do_call at /cache/build/tester-amdci4-11/julialang/julia-master/src/interpreter.c:126
eval_value at /cache/build/tester-amdci4-11/julialang/julia-master/src/interpreter.c:223
eval_stmt_value at /cache/build/tester-amdci4-11/julialang/julia-master/src/interpreter.c:174 [inlined]
eval_body at /cache/build/tester-amdci4-11/julialang/julia-master/src/interpreter.c:617
jl_interpret_toplevel_thunk at /cache/build/tester-amdci4-11/julialang/julia-master/src/interpreter.c:775
jl_toplevel_eval_flex at /cache/build/tester-amdci4-11/julialang/julia-master/src/toplevel.c:934
jl_toplevel_eval_flex at /cache/build/tester-amdci4-11/julialang/julia-master/src/toplevel.c:877
jl_toplevel_eval_flex at /cache/build/tester-amdci4-11/julialang/julia-master/src/toplevel.c:877
ijl_toplevel_eval_in at /cache/build/tester-amdci4-11/julialang/julia-master/src/toplevel.c:985
eval at ./boot.jl:385 [inlined]
exec_options at ./client.jl:296
_start at ./client.jl:557
jfptr__start_82923.1 at /home/nsajko/tmp/jl/jl/julia-1.10.8/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/tester-amdci4-11/julialang/julia-master/src/gf.c:2895 [inlined]
ijl_apply_generic at /cache/build/tester-amdci4-11/julialang/julia-master/src/gf.c:3077
jl_apply at /cache/build/tester-amdci4-11/julialang/julia-master/src/julia.h:1982 [inlined]
true_main at /cache/build/tester-amdci4-11/julialang/julia-master/src/jlapi.c:582
jl_repl_entrypoint at /cache/build/tester-amdci4-11/julialang/julia-master/src/jlapi.c:731
main at /cache/build/tester-amdci4-11/julialang/julia-master/cli/loader_exe.c:58
unknown function (ip: 0x760ba7dfae07)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 0x4010b8)
Allocations: 3245613 (Pool: 3242991; Big: 2622); GC: 5
Aborted (core dumped)
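Judging by the backtrace, the abort happens while AMDGPU.versioninfo() lazily dlopens librocblas to query its version string (rocblas_get_version_string). Assuming the module layout matches the trace, the failing step can presumably be reproduced on its own with:

./julia-1.10.8/bin/julia -g2 -e 'using AMDGPU; AMDGPU.rocBLAS.version()'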

Any ideas as to why this is failing? I guess it could be, e.g.:

  • a bug in one of the Julia packages
  • a bug in AMD’s code
  • I failed to install some relevant Arch Linux package

EDIT: this seems to be either an issue in AMD’s code or in the Arch Linux packaging: Coredump on using AMDGPU · Issue #696 · JuliaGPU/AMDGPU.jl · GitHub

I wonder if anyone knows of a workaround, apart from downgrading the Arch packages?

(workaround)

If no better solution is available, you might want to try running Julia+ROCm in Docker. The main advantage is that it makes testing different ROCm and Julia combinations much simpler.

Here’s my sample Dockerfile that you can adapt:

# Set base ROCm version - https://hub.docker.com/r/rocm/dev-ubuntu-22.04/tags
#FROM rocm/dev-ubuntu-22.04:6.3.1-complete
FROM rocm/dev-ubuntu-22.04:6.2.4-complete
RUN set -eux ; \
    apt-get update \
    && apt-get install -yqq --no-install-suggests --no-install-recommends \
        build-essential \
        sudo \
        time \
        wget \
        zstd \
    \
    && rm -rf /var/lib/apt/lists/* /tmp/*
# https://github.com/ROCm/ROCm/discussions/2631
# Radeon 780M (gfx1103) 
ENV HSA_OVERRIDE_GFX_VERSION=11.0.3
# For Julia 1.11, set JULIA_LLVM_ARGS="-opaque-pointers" to enable opaque pointers and use the system-wide device libraries instead of the patched ones from the artifacts.
ENV JULIA_LLVM_ARGS="-opaque-pointers"
# Some settings
ENV JULIA_NUM_THREADS=4
ENV JULIA_AMDGPU_CORE_MUST_LOAD="1"
ENV JULIA_AMDGPU_HIP_MUST_LOAD="1"
ENV JULIA_AMDGPU_DISABLE_ARTIFACTS="1"
# add zen4 cpu target
ENV JULIA_CPU_TARGET="generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1);x86-64-v4,-rdrnd,base(1);znver4,-rdrnd,base(1)"
# set julia version
ENV JULIA_MAJOR=1.11
ENV JULIA_VERSION=1.11.3
ENV JULIA_SHA256=7d48da416c8cb45582a1285d60127ee31ef7092ded3ec594a9f2cf58431c07fd
ENV JULIA_DIR=/usr/local/julia
ENV JULIA_PATH=${JULIA_DIR}
ENV JULIA_DEPOT_PATH=${JULIA_PATH}/local/share/julia
# Install julia
RUN set -eux \
    && mkdir ${JULIA_DIR} \
    && cd /tmp  \
    && wget -q https://julialang-s3.julialang.org/bin/linux/x64/${JULIA_MAJOR}/julia-${JULIA_VERSION}-linux-x86_64.tar.gz \
    && echo "$JULIA_SHA256 julia-${JULIA_VERSION}-linux-x86_64.tar.gz" | sha256sum -c - \
    && tar xzf julia-${JULIA_VERSION}-linux-x86_64.tar.gz -C ${JULIA_DIR} --strip-components=1 \
    && rm /tmp/julia-${JULIA_VERSION}-linux-x86_64.tar.gz \
    && ln -fs ${JULIA_DIR}/bin/julia /usr/local/bin/julia \
    && julia -e 'using Pkg; Pkg.add(["AMDGPU", "CpuId", "CSV", "JSON3" ]);Pkg.precompile()' \
    && julia -e 'using CpuId, CSV, JSON3 ;' \
    && julia -e 'using InteractiveUtils; versioninfo()'

To test it:

# build
docker build --progress=plain --network=host  -t julia-rocm .

# run 
alias drun='sudo docker run -it --network=host --device=/dev/kfd --device=/dev/dri --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --shm-size 8G -v $HOME/dockerx:/dockerx -w /dockerx'

drun julia-rocm

julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.3 (2025-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using AMDGPU; AMDGPU.versioninfo();
[ Info: AMDGPU versioninfo
┌───────────┬──────────────────┬───────────┬───────────────────────────────┐
│ Available │ Name             │ Version   │ Path                          │
├───────────┼──────────────────┼───────────┼───────────────────────────────┤
│     +     │ LLD              │ -         │ /opt/rocm/llvm/bin/ld.lld     │
│     +     │ Device Libraries │ -         │ /opt/rocm/amdgcn/bitcode      │
│     +     │ HIP              │ 6.2.41134 │ /opt/rocm/lib/libamdhip64.so  │
│     +     │ rocBLAS          │ 4.2.4     │ /opt/rocm/lib/librocblas.so   │
│     +     │ rocSOLVER        │ 3.26.2    │ /opt/rocm/lib/librocsolver.so │
│     +     │ rocSPARSE        │ -         │ /opt/rocm/lib/librocsparse.so │
│     +     │ rocRAND          │ 2.10.5    │ /opt/rocm/lib/librocrand.so   │
│     +     │ rocFFT           │ 1.0.27    │ /opt/rocm/lib/librocfft.so    │
│     +     │ MIOpen           │ 3.2.0     │ /opt/rocm/lib/libMIOpen.so    │
└───────────┴──────────────────┴───────────┴───────────────────────────────┘

[ Info: AMDGPU devices
┌────┬─────────────────────┬──────────┬───────────┬────────────┬───────────────┐
│ Id │                Name │ GCN arch │ Wavefront │     Memory │ Shared Memory │
├────┼─────────────────────┼──────────┼───────────┼────────────┼───────────────┤
│  1 │ AMD Radeon Graphics │  gfx1103 │        32 │ 14.826 GiB │    64.000 KiB │
└────┴─────────────────────┴──────────┴───────────┴────────────┴───────────────┘


julia> x = ROCArray([1.0])
1-element ROCArray{Float64, 1, AMDGPU.Runtime.Mem.HIPBuffer}:
 1.0
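
If Docker is not an option, it might also be worth exporting the same environment variables from the Dockerfile directly on the host before launching Julia. This is only a sketch: the HSA_OVERRIDE_GFX_VERSION value below (11.0.3) matches gfx1103 and would have to be changed for your own GPU, and I haven’t checked whether this works around the Arch packaging issue.

export HSA_OVERRIDE_GFX_VERSION=11.0.3    # must match your GPU; 11.0.3 is for gfx1103 (Radeon 780M)
export JULIA_AMDGPU_DISABLE_ARTIFACTS=1   # prefer the system ROCm libraries over the artifacts
export JULIA_AMDGPU_CORE_MUST_LOAD=1
export JULIA_AMDGPU_HIP_MUST_LOAD=1
julia -e 'using AMDGPU; AMDGPU.versioninfo()'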
