CUDA profiling tools crashing

Following the docs, I’m trying to get extra insight into my kernels using CUDA’s profiling tools (Nsight Systems and Nsight Compute).

I’ve tried running

$ nsys launch julia

along with

$ ncu --mode=launch julia

but both commands crash with the following:

fatal: error thrown and no exception handler available.
InitError(mod=:Sys, error=ErrorException("PCRE compilation error: unrecognised compile-time option bit(s) at offset 0"))
error at ./error.jl:35
compile at ./pcre.jl:165
compile at ./regex.jl:75
#occursin#500 at ./regex.jl:258 [inlined]
occursin at ./regex.jl:257 [inlined]
isdirpath at ./path.jl:117 [inlined]
normpath at ./path.jl:373
abspath at ./path.jl:440
abspath at ./path.jl:449 [inlined]
__init_build at ./sysinfo.jl:128
__init__ at ./sysinfo.jl:120
jfptr___init___51927.clone_1 at /home/alechammond/install/julia-1.9.2/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/julia.h:1879 [inlined]
jl_module_run_initializer at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/toplevel.c:75
_finish_julia_init at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/init.c:855
julia_init at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/init.c:804
jl_repl_entrypoint at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/src/jlapi.c:711
main at /cache/build/default-amdci5-2/julialang/julia-release-1-dot-9/cli/loader_exe.c:59
__libc_start_main at /lib64/libc.so.6 (unknown line)
unknown function (ip: 0x401098)

I’m running Julia 1.9.2 on Linux (using the distributed binaries). Here is my Julia CUDA configuration:

julia> using CUDA

julia> CUDA.versioninfo()
CUDA runtime 12.0, local installation
CUDA driver 12.1
NVIDIA driver 525.105.17, originally for CUDA 12.0

Libraries: 
- CUBLAS: 12.0.2
- CURAND: 10.3.1
- CUFFT: 11.0.1
- CUSOLVER: 11.4.3
- CUSPARSE: 12.0.1
- CUPTI: 18.0.0
- NVML: 12.0.0+525.105.17

Toolchain:
- Julia: 1.9.2
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

Environment:
- JULIA_CUDA_MEMORY_POOL: none

4 devices:
  0: NVIDIA H100 (sm_90, 95.037 GiB / 95.577 GiB available)
  1: NVIDIA H100 (sm_90, 95.037 GiB / 95.577 GiB available)
  2: NVIDIA H100 (sm_90, 95.037 GiB / 95.577 GiB available)
  3: NVIDIA H100 (sm_90, 95.037 GiB / 95.577 GiB available)


Anybody have suggestions on how I can go about debugging this?

It’s a bug in Nsight: the environment the profiler sets up for the launched process appears to make Julia load an incompatible system PCRE instead of its own bundled copy. Try launching with Julia’s library directory on LD_LIBRARY_PATH:

LD_LIBRARY_PATH=$(/path/to/julia -e 'println(joinpath(Sys.BINDIR, Base.LIBDIR, "julia"))') ncu --mode=launch /path/to/julia
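For convenience, the same workaround can be wrapped in a small POSIX-shell helper covering both profilers. This is an untested sketch; the function name profile_julia and the JULIA environment-variable override are my own naming, not part of Nsight or Julia:

```shell
# profile_julia: launch Julia under nsys or ncu with Julia's bundled
# libraries on LD_LIBRARY_PATH, working around the PCRE init crash.
# Override the Julia binary with e.g. JULIA=/path/to/julia.
profile_julia() {
    jl=${JULIA:-julia}
    case "$1" in
        nsys|ncu)
            tool=$1; shift
            # Ask Julia itself where its private library directory lives.
            libs=$("$jl" -e 'println(joinpath(Sys.BINDIR, Base.LIBDIR, "julia"))') || return 1
            if [ "$tool" = nsys ]; then
                LD_LIBRARY_PATH="$libs" nsys launch "$jl" "$@"
            else
                LD_LIBRARY_PATH="$libs" ncu --mode=launch "$jl" "$@"
            fi
            ;;
        *)
            echo "usage: profile_julia {nsys|ncu} [julia args...]" >&2
            return 2
            ;;
    esac
}
```

Then `profile_julia ncu` or `profile_julia nsys` replaces the manual LD_LIBRARY_PATH incantation.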

@maleadt that works, thanks! Clever trick for populating LD_LIBRARY_PATH.