CUDA not working after today's update

Hi,

Yesterday I read the announcement about the new CUDA package. Today I ran an update (Julia 1.6.0 on Ubuntu MATE 20.04) and the CUDA package was upgraded. After that, I get this:

julia> using CUDA

julia> CUDA.zeros(10)
┌ Error: Error during initialization of CUDA.jl
│ exception =
│ CUDA error: forward compatibility was attempted on non supported HW (code 804, ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE)
│ Stacktrace:
│ [1] throw_api_error(res::CUDA.cudaError_enum)
│ @ CUDA ~/.julia/packages/CUDA/MpASK/lib/cudadrv/error.jl:88
│ [2] __runtime_init__()
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/initialization.jl:105
│ [3] macro expansion
│ @ ~/.julia/packages/CUDA/MpASK/src/initialization.jl:31 [inlined]
│ [4] macro expansion
│ @ ./lock.jl:209 [inlined]
│ [5] _functional(show_reason::Bool)
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/initialization.jl:27
│ [6] functional(show_reason::Bool)
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/initialization.jl:19
│ [7] libcuda()
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/initialization.jl:51
│ [8] macro expansion
│ @ ~/.julia/packages/CUDA/MpASK/lib/cudadrv/libcuda.jl:29 [inlined]
│ [9] macro expansion
│ @ ~/.julia/packages/CUDA/MpASK/lib/cudadrv/error.jl:94 [inlined]
│ [10] cuDeviceGet
│ @ ~/.julia/packages/CUDA/MpASK/lib/utils/call.jl:26 [inlined]
│ [11] CuDevice
│ @ ~/.julia/packages/CUDA/MpASK/lib/cudadrv/devices.jl:25 [inlined]
│ [12] CUDA.TaskLocalState()
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/state.jl:50
│ [13] task_local_state!()
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/state.jl:73
│ [14] stream
│ @ ~/.julia/packages/CUDA/MpASK/src/state.jl:419 [inlined]
│ [15] alloc
│ @ ~/.julia/packages/CUDA/MpASK/src/pool.jl:278 [inlined]
│ [16] CuArray{Float32, 1}(#unused#::UndefInitializer, dims::Tuple{Int64})
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/array.jl:20
│ [17] CuArray
│ @ ~/.julia/packages/CUDA/MpASK/src/array.jl:101 [inlined]
│ [18] CuArray
│ @ ~/.julia/packages/CUDA/MpASK/src/array.jl:102 [inlined]
│ [19] zeros
│ @ ~/.julia/packages/CUDA/MpASK/src/array.jl:376 [inlined]
│ [20] zeros(dims::Int64)
│ @ CUDA ~/.julia/packages/CUDA/MpASK/src/array.jl:378
│ [21] top-level scope
│ @ REPL[3]:1
│ [22] top-level scope
│ @ ~/.julia/packages/CUDA/MpASK/src/initialization.jl:81
│ [23] eval
│ @ ./boot.jl:360 [inlined]
│ [24] eval_user_input(ast::Any, backend::REPL.REPLBackend)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:139
│ [25] repl_backend_loop(backend::REPL.REPLBackend)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:200
│ [26] start_repl_backend(backend::REPL.REPLBackend, consumer::Any)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:185
│ [27] run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:317
│ [28] run_repl(repl::REPL.AbstractREPL, consumer::Any)
│ @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:305
│ [29] (::Base.var"#874#876"{Bool, Bool, Bool})(REPL::Module)
│ @ Base ./client.jl:387
│ [30] #invokelatest#2
│ @ ./essentials.jl:708 [inlined]
│ [31] invokelatest
│ @ ./essentials.jl:706 [inlined]
│ [32] run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
│ @ Base ./client.jl:372
│ [33] exec_options(opts::Base.JLOptions)
│ @ Base ./client.jl:302
│ [34] _start()
│ @ Base ./client.jl:485
└ @ CUDA ~/.julia/packages/CUDA/MpASK/src/initialization.jl:34
ERROR: CUDA.jl did not successfully initialize, and is not usable.
If you did not see any other error message, try again in a new session
with the JULIA_DEBUG environment variable set to 'CUDA'
etc…

Needless to say, it was working perfectly right up until the update, and nothing else changed on my system. In fact, I ran some Julia code using CUDA and did the update only after that run had finished.

What can I do from here? And since I need to run calculations on this computer, can I revert to the previous CUDA version in the meantime? If so, how?

Thanks a lot,

Ferran.

Sorry, forget it. It's already solved.

In case anybody else runs into this and finds this post: this happens when libcuda.so has been upgraded but the NVIDIA driver has not (or when you haven't rebooted to load the new driver). That kind of forward compatibility (a newer libcuda on an older driver) is only supported on Tesla hardware.
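
If you want to check for that mismatch before rebooting, something like the following might help. It is only a rough sketch and rests on assumptions that are not from this thread: a Linux system with the proprietary NVIDIA driver, the loaded kernel module reporting its version in /proc/driver/nvidia/version in the usual "Kernel Module  <version>" form, and libcuda being installed as a file named libcuda.so.<version> in one of the listed directories (those paths are guesses and may differ on your distribution). All function names here are made up for illustration; none of this is part of CUDA.jl.

```julia
# Sketch: compare the loaded NVIDIA kernel module version with the libcuda on disk.
# Assumptions: Linux, proprietary NVIDIA driver, library file named libcuda.so.<version>.

# Pull the driver version out of the text of /proc/driver/nvidia/version.
# Returns `nothing` if the text does not have the expected form.
function kernel_module_version(text::AbstractString)
    m = match(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return m === nothing ? nothing : String(m[1])
end

# Pull the version suffix out of a file name such as "libcuda.so.<version>".
# Plain symlink names like "libcuda.so" or "libcuda.so.1" yield `nothing`.
function libcuda_version(filename::AbstractString)
    m = match(r"^libcuda\.so\.(\d+(?:\.\d+)+)$", filename)
    return m === nothing ? nothing : String(m[1])
end

# Collect the versions of all libcuda files found in the given directories.
function installed_libcuda_versions(libdirs)
    versions = String[]
    for dir in libdirs
        isdir(dir) || continue
        for name in readdir(dir)
            v = libcuda_version(name)
            v === nothing || push!(versions, v)
        end
    end
    return unique(versions)
end

# Print both versions and warn if the loaded module matches none of the libraries.
function check_driver_mismatch(; procfile = "/proc/driver/nvidia/version",
                                 libdirs = ["/usr/lib/x86_64-linux-gnu", "/usr/lib64"])
    if !isfile(procfile)
        println("No NVIDIA kernel module seems to be loaded (", procfile, " not found).")
        return nothing
    end
    loaded = kernel_module_version(read(procfile, String))
    installed = installed_libcuda_versions(libdirs)
    println("Loaded kernel module: ", loaded)
    println("Installed libcuda:    ", installed)
    if loaded !== nothing && !isempty(installed) && !(loaded in installed)
        println("Mismatch: reboot (or reload the driver) so the new kernel module gets loaded.")
    end
    return nothing
end

check_driver_mismatch()
```

If the two versions differ, the library on disk is newer than the driver that is actually running, which is the situation described above. A reboot should bring them back in sync.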