Hi everybody! I had CUDA.jl installed and running nicely last summer, but I somehow broke it (possibly through a driver or CUDA update?). The problem is probably not directly related to CUDA.jl. I searched dmesg for errors and rebooted, and nvidia-smi works.
The problem
julia> versioninfo()
Julia Version 1.7.0-DEV.203
Commit b00e9f0bac (2020-12-31 06:59 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: AMD Ryzen 7 1700X Eight-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.0 (ORCJIT, znver1)
Environment:
JULIA_DEBUG = CUDA
julia> using CUDA
julia> CUDA.version()
┌ Debug: Initializing CUDA driver
└ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:88
┌ Error: Recursion during initialization of CUDA.jl
└ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:38
┌ Error: Error during initialization of CUDA.jl
│ exception =
│ CUDA error (code 999, CUDA_ERROR_UNKNOWN)
│ Stacktrace:
│ [1] throw_api_error(res::CUDA.cudaError_enum)
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/error.jl:97
│ [2] __configure__()
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:93
│ [3] macro expansion
│ @ ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:30 [inlined]
│ [4] macro expansion
│ @ ./lock.jl:209 [inlined]
│ [5] _functional(show_reason::Bool)
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:26
│ [6] functional(show_reason::Bool)
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:19
│ [7] libcuda()
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:47
│ [8] macro expansion
│ @ ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/libcuda.jl:23 [inlined]
│ [9] macro expansion
│ @ ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/error.jl:102 [inlined]
│ [10] cuDriverGetVersion
│ @ ~/.julia/packages/CUDA/qSZa3/lib/utils/call.jl:26 [inlined]
│ [11] version()
│ @ CUDA ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/version.jl:10
│ [12] top-level scope
│ @ REPL[2]:1
│ [13] eval(m::Module, e::Any)
│ @ Core ./boot.jl:369
│ [14] eval_user_input(ast::Any, backend::REPL.REPLBackend)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:139
│ [15] repl_backend_loop(backend::REPL.REPLBackend)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:200
│ [16] start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:185
│ [17] run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:317
│ [18] run_repl(repl::REPL.AbstractREPL, consumer::Any)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/REPL/src/REPL.jl:305
│ [19] (::Base.var"#890#892"{Bool, Bool, Bool})(REPL::Module)
│ @ Base ./client.jl:394
│ [20] #invokelatest#2
│ @ ./essentials.jl:710 [inlined]
│ [21] invokelatest
│ @ ./essentials.jl:708 [inlined]
│ [22] run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
│ @ Base ./client.jl:379
│ [23] exec_options(opts::Base.JLOptions)
│ @ Base ./client.jl:309
│ [24] _start()
│ @ Base ./client.jl:492
└ @ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:34
ERROR: CUDA.jl did not successfully initialize, and is not usable.
If you did not see any other error message, try again in a new session
with the JULIA_DEBUG environment variable set to 'CUDA'.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] libcuda()
@ CUDA ~/.julia/packages/CUDA/qSZa3/src/initialization.jl:48
[3] macro expansion
@ ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/libcuda.jl:23 [inlined]
[4] macro expansion
@ ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/error.jl:102 [inlined]
[5] cuDriverGetVersion
@ ~/.julia/packages/CUDA/qSZa3/lib/utils/call.jl:26 [inlined]
[6] version()
@ CUDA ~/.julia/packages/CUDA/qSZa3/lib/cudadrv/version.jl:10
[7] top-level scope
@ REPL[2]:1
I’m using:
Debian stretch
NVIDIA driver 450.80.02 from backports
system CUDA installation 11.1 from backports
nvcc is at: /usr/bin/nvcc
libcuda.so is at: /usr/lib/i386-linux-gnu/nvidia/current/libcuda.so
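In case it helps with diagnosis, here is a minimal sketch for checking which libcuda the dynamic loader actually resolves from inside Julia. It uses only the Libdl standard library and is not something CUDA.jl provides: `resolve_library` is a hypothetical helper name, and the list of library names to try is an assumption. The path listed above sits under an i386 directory, which on Debian normally holds 32-bit libraries, so it may be worth confirming that a 64-bit copy is visible to the loader as well.

```julia
using Libdl

# Hypothetical helper (not part of CUDA.jl): return the on-disk path of the
# first library in `names` that the dynamic loader can open, or `nothing`
# if none of them can be loaded.
function resolve_library(names::Vector{String})
    for name in names
        # With throw_error=false, dlopen returns `nothing` instead of throwing.
        handle = Libdl.dlopen(name; throw_error=false)
        handle === nothing && continue
        path = Libdl.dlpath(handle)
        Libdl.dlclose(handle)
        return path
    end
    return nothing
end

# The candidate names below are assumptions about how the driver library is named.
path = resolve_library(["libcuda.so.1", "libcuda.so", "libcuda"])
println(path === nothing ? "libcuda could not be loaded" : "libcuda resolved to $path")
```

If this prints a path under an x86_64 directory, the loader is finding a usable 64-bit driver library; if it reports that libcuda could not be loaded, the library search path is the first thing to look at.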
I assume that CUDA.jl doesn’t download artifacts because the driver version is unsupported (although CUDA 11.1 should be binary-compatible with driver 450), and that it doesn’t find the local installation made with apt. Can you give me a hint on what to do?