LoadError: CUDA runtime not found

I am trying to implement a couple of algorithms in CUDA using Julia and the CUDA.jl package. I started with Julia a couple of days ago, so the error may be easy to solve, but I haven’t figured out a solution yet.

The error I’m getting is this:

┌ Error: No CUDA Runtime library found. This can have several reasons:
│ * you are using an unsupported platform: CUDA.jl only supports Linux (x86_64, aarch64, ppc64le), and Windows (x86_64).
│   refer to the documentation for instructions on how to use a custom CUDA runtime.
│ * you precompiled CUDA.jl in an environment where the CUDA driver was not available.
│   in that case, you need to specify (during pre compilation) which version of CUDA to use.
│   refer to the documentation for instructions on how to use `CUDA.set_runtime_version!`.
│ * you requested use of a local CUDA toolkit, but not all components were discovered.
│   try running with JULIA_DEBUG=CUDA_Runtime_Discovery for more information.
└ @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/initialization.jl:77
ERROR: LoadError: CUDA runtime not found
Stacktrace:
 [1] functional(show_reason::Bool)
   @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/initialization.jl:24
 [2] top-level scope
   @ .../main.jl:7
in expression starting at .../main.jl:7

Running status in the REPL package manager gives this:

(@v1.8) pkg> status
Status `~/.julia/environments/v1.8/Project.toml`
  [052768ef] CUDA v4.0.1

I have also tried adding using Pkg; Pkg.add("CUDA") at the start of the file, but I still get the same error message.
I have a CUDA-capable GPU (MX330), and CUDA C works fine.
The CUDA_PATH environment variable seems to be set correctly:

julia> ENV["CUDA_PATH"]
"/opt/cuda"

I use Manjaro Linux with the proprietary NVIDIA driver, and the CUDA Linux packages are located at /opt/cuda.

If some information is missing, please let me know. :slight_smile:

Please run with JULIA_DEBUG=all and post the output.

Thanks for the quick reply, here is the output:

JULIA_DEBUG=all julia main.jl
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
  No Changes to `~/.julia/environments/v1.8/Project.toml`
  No Changes to `~/.julia/environments/v1.8/Manifest.toml`
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CEnum/0gyUJ_xFL6F.ji for CEnum [fa961155-64e5-5f13-b03f-caf6b980ea82]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/LLVMExtra_jll/R9OeX_xFL6F.ji for LLVMExtra_jll [dad2f222-ce93-54a1-a47d-0025e8a3acab]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/LLVM/e8NBy_xFL6F.ji for LLVM [929cbde3-209d-540e-8aea-75f648917ca0]
└ @ Base loading.jl:806
┌ Debug: Using LLVM 14.0.6 at /usr/bin/../lib/julia/../libLLVM-14.so
└ @ LLVM ~/.julia/packages/LLVM/s3bxG/src/LLVM.jl:86
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/ExprTools/eM8wu_xFL6F.ji for ExprTools [e2ba6199-217a-4e67-a87a-7c52f15ade04]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/TimerOutputs/hd2yD_xFL6F.ji for TimerOutputs [a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/GPUCompiler/yPwef_xFL6F.ji for GPUCompiler [61eb1bfa-7361-4325-ad38-22787b887f55]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Requires/IyxeS_xFL6F.ji for Requires [ae029012-a4dd-5104-9daa-d747884805df]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Adapt/rUIgN_xFL6F.ji for Adapt [79e6a3ab-5dfb-504d-930d-738a2a938a0e]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Reexport/bTpYr_xFL6F.ji for Reexport [189a3867-3050-52da-a836-e630ba90ab69]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/GPUArraysCore/qiYUe_xFL6F.ji for GPUArraysCore [46192b85-c4d5-4398-a991-12ede77f4527]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/GPUArrays/v5u0T_xFL6F.ji for GPUArrays [0c68f7d7-f131-5f86-a1c3-88cf8149b2d7]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/BFloat16s/iiZ8G_xFL6F.ji for BFloat16s [ab4f0b2a-ad5b-11e8-123f-65d77653426b]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Runtime_jll/Hs50y_xFL6F.ji for CUDA_Runtime_jll [76a88914-d11a-5bdc-97e0-2f5a05c973a2]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/RandomNumbers/pgCpR_xFL6F.ji for RandomNumbers [e6cf234a-135c-5ec9-84dd-332b85af5143]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Random123/1imiM_xFL6F.ji for Random123 [74087812-796a-5b5d-8853-05524746bad3]
└ @ Base loading.jl:806
┌ Debug: Requires conditionally ran code in 0.322408384 seconds: `RandomNumbers` detected `Random123`
└ @ Requires ~/.julia/packages/RandomNumbers/3pD1N/src/RandomNumbers.jl:38
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Compat/GSFWK_xFL6F.ji for Compat [34da2185-b29b-5c13-b0c7-acf172513d20]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/ChainRulesCore/G6ax7_xFL6F.ji for ChainRulesCore [d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/AbstractFFTs/Di3HZ_xFL6F.ji for AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA/oWw5k_xFL6F.ji for CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]
└ @ Base loading.jl:806
┌ Error: No CUDA Runtime library found. This can have several reasons:
│ * you are using an unsupported platform: CUDA.jl only supports Linux (x86_64, aarch64, ppc64le), and Windows (x86_64).
│   refer to the documentation for instructions on how to use a custom CUDA runtime.
│ * you precompiled CUDA.jl in an environment where the CUDA driver was not available.
│   in that case, you need to specify (during pre compilation) which version of CUDA to use.
│   refer to the documentation for instructions on how to use `CUDA.set_runtime_version!`.
│ * you requested use of a local CUDA toolkit, but not all components were discovered.
│   try running with JULIA_DEBUG=CUDA_Runtime_Discovery for more information.
└ @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/initialization.jl:77
ERROR: LoadError: CUDA runtime not found
Stacktrace:
 [1] functional(show_reason::Bool)
   @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/initialization.jl:24
 [2] top-level scope
   @ ~/code/msa/main.jl:7
in expression starting at /home/robin/code/msa/main.jl:7

Oh, you’re using CUDA 12. That’s a breaking release that isn’t supported yet by any released version of CUDA.jl. Try the CUDA.jl master branch, which should work (I plan to tag a release soon).

Actually, scratch that, I was confusing CUDA toolkit with driver compatibility. We should be fine supporting CUDA 12 on currently-released CUDA.jl:

julia> using CUDA

julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.0
NVIDIA driver 525.89.2

Can you show the platform that CUDA.jl uses?

julia> Base.BinaryPlatforms.triplet(CUDA.CUDA_Runtime_jll.host_platform)
"x86_64-linux-gnu-libgfortran5-cxx11-libstdcxx30-cuda+11.8-julia_version+1.8.5"

Testing CUDA.jl#master might be useful though.

The platform:

julia> Base.BinaryPlatforms.triplet(CUDA.CUDA_Runtime_jll.host_platform)
"x86_64-linux-gnu-libgfortran5-cxx11-libstdcxx30-cuda+none-julia_version+1.8.5"

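The two triplets posted above differ only in their cuda+ tag: 11.8 on the working machine and none on the failing one, which suggests that no CUDA toolkit version was selected in the failing environment. A minimal sketch for pulling that tag out of a triplet string follows; cuda_tag is a hypothetical helper written for illustration and is not part of CUDA.jl.

```julia
# Hypothetical helper (not a CUDA.jl function): extract the value of the
# `cuda+...` tag from a platform triplet string.
# Returns `nothing` when the triplet carries no cuda tag at all.
function cuda_tag(triplet::AbstractString)
    m = match(r"cuda\+([^-]+)", triplet)
    return m === nothing ? nothing : String(m.captures[1])
end

# The two triplets quoted in this thread.
working = "x86_64-linux-gnu-libgfortran5-cxx11-libstdcxx30-cuda+11.8-julia_version+1.8.5"
failing = "x86_64-linux-gnu-libgfortran5-cxx11-libstdcxx30-cuda+none-julia_version+1.8.5"

println("working: ", cuda_tag(working))
println("failing: ", cuda_tag(failing))
```
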
Alright, we’re getting closer. Can you try the following and post the output:

git clone https://github.com/JuliaBinaryWrappers/CUDA_Runtime_jll.jl
cd CUDA_Runtime_jll.jl
git checkout CUDA_Runtime-v0.3.1+1  # or whatever version you were using above
JULIA_DEBUG=all julia --project -e 'using CUDA_Runtime_jll'

Here is the output of the JULIA_DEBUG-command:

JULIA_DEBUG=all julia --project -e 'using CUDA_Runtime_jll'

┌ Debug: Rejecting cache file /home/robin/.julia/compiled/v1.8/CUDA_Runtime_jll/Hs50y_xFL6F.ji because it is for file /home/robin/.julia/packages/CUDA_Runtime_jll/NcfZF/src/CUDA_Runtime_jll.jl not file /home/robin/code/CUDA_Runtime_jll.jl/src/CUDA_Runtime_jll.jl
└ @ Base loading.jl:2135
┌ Debug: Precompiling CUDA_Runtime_jll [76a88914-d11a-5bdc-97e0-2f5a05c973a2]
└ @ Base loading.jl:1664
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Using CUDA_Driver_jll for driver discovery
└ @ CUDA_Runtime_jll ~/code/CUDA_Runtime_jll.jl/.pkg/platform_augmentation.jl:48
┌ Debug: Found CUDA driver at 'libcuda.so.1'
└ @ CUDA_Runtime_jll ~/code/CUDA_Runtime_jll.jl/.pkg/platform_augmentation.jl:74
┌ Debug: CUDA driver version: 12.0.0
└ @ CUDA_Runtime_jll ~/code/CUDA_Runtime_jll.jl/.pkg/platform_augmentation.jl:106
┌ Debug: Selected CUDA toolkit: 12.0.0
└ @ CUDA_Runtime_jll ~/code/CUDA_Runtime_jll.jl/.pkg/platform_augmentation.jl:131
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Runtime_jll/Hs50y_66ZMi.ji for CUDA_Runtime_jll [76a88914-d11a-5bdc-97e0-2f5a05c973a2]
└ @ Base loading.jl:806
  Downloaded artifact: CUDA_Runtime

I don’t know which CUDA_Runtime version I am using, but it seems to be 11.8, so I used the same commit as you were using (v0.3.1+1):

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Thanks. That log shows that everything works though… Can you show the triplet again in that configuration, i.e., JULIA_DEBUG=all julia --project -e 'using CUDA_Runtime_jll; @show Base.BinaryPlatforms.triplet(CUDA_Runtime_jll.host_platform)'

Also, please check in your original (broken) environment which version of CUDA_Runtime_jll you were using exactly by doing ]st -m
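
The same lookup can be sketched from Julia code instead of the Pkg REPL. This assumes the standard Pkg.dependencies() API; manifest_version is a hypothetical helper name, and the deps keyword exists only so the function can be tried on hand-made data.

```julia
using Pkg

# Hypothetical helper: look up the resolved version of package `name`.
# `deps` defaults to the active environment's full dependency graph
# (direct and indirect dependencies, i.e. what `st -m` lists).
function manifest_version(name::AbstractString; deps = Pkg.dependencies())
    for (_, info) in deps
        info.name == name && return info.version
    end
    return nothing  # the package is not in the manifest
end

# Prints the version, or `nothing` if the package is not installed here.
println(manifest_version("CUDA_Runtime_jll"))
```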

Finally, it may be that a bad situation inadvertently got cached, so you could try loading CUDA.jl again after removing ~/.julia/compiled/*/CUDA_Runtime_jll.
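
A rough Julia equivalent of removing that directory by hand, assuming the cache lives under the first entry of DEPOT_PATH (usually ~/.julia); clear_compile_cache is a hypothetical helper, not a Pkg function.

```julia
# Hypothetical helper: delete the precompile cache of package `pkg` for every
# Julia version directory under `<depot>/compiled`, and return the removed paths.
function clear_compile_cache(pkg::AbstractString; depot::AbstractString = first(DEPOT_PATH))
    removed = String[]
    compiled = joinpath(depot, "compiled")
    isdir(compiled) || return removed
    for version_dir in readdir(compiled; join = true)
        target = joinpath(version_dir, pkg)
        if isdir(target)
            rm(target; recursive = true)
            push!(removed, target)
        end
    end
    return removed
end

# Uncomment to run against the real depot:
# clear_compile_cache("CUDA_Runtime_jll")
```

Julia should then precompile the package again the next time it is loaded.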

Triplet after the commands done in CUDA_Runtime_jll.jl:

CUDA_Runtime-v0.3.1+1    JULIA_DEBUG=all julia --project -e 'using CUDA_Runtime_jll; @show Base.BinaryPlatforms.triplet(CUDA_Runtime_jll.host_platform)'
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/Preferences/pWSk8_xFL6F.ji for Preferences [21216c6a-2e73-6563-6e65-726566657250]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/JLLWrappers/7Zgw7_xFL6F.ji for JLLWrappers [692b3bcd-3c85-4b1f-b108-f13ce0eb3210]
└ @ Base loading.jl:806
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Driver_jll/QJyk7_xFL6F.ji for CUDA_Driver_jll [4ee394cb-3365-5eb0-8335-949819d2adfc]
└ @ Base loading.jl:806
┌ Debug: System CUDA driver found at libcuda.so.1, detected as version 12.0.0
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:76
┌ Debug: System CUDA driver is recent enough; not using forward-compatible driver
└ @ CUDA_Driver_jll ~/.julia/packages/CUDA_Driver_jll/9E4Mc/src/wrappers/x86_64-linux-gnu.jl:94
┌ Debug: Loading cache file /home/robin/.julia/compiled/v1.8/CUDA_Runtime_jll/Hs50y_66ZMi.ji for CUDA_Runtime_jll [76a88914-d11a-5bdc-97e0-2f5a05c973a2]
└ @ Base loading.jl:806
Base.BinaryPlatforms.triplet(CUDA_Runtime_jll.host_platform) = "x86_64-linux-gnu-libgfortran5-cxx11-libstdcxx30-cuda+12.0-julia_version+1.8.5"

Output of status then st -m:

(@v1.8) pkg> status
Status `~/.julia/environments/v1.8/Project.toml`
  [052768ef] CUDA v4.0.1

(@v1.8) pkg> st -m
Status `~/.julia/environments/v1.8/Manifest.toml`
  [621f4979] AbstractFFTs v1.3.1
  [79e6a3ab] Adapt v3.6.1
  [ab4f0b2a] BFloat16s v0.4.2
  [fa961155] CEnum v0.4.2
  [052768ef] CUDA v4.0.1
  [1af6417a] CUDA_Runtime_Discovery v0.1.1
  [d360d2e6] ChainRulesCore v1.15.7
  [9e997f8a] ChangesOfVariables v0.1.6
  [34da2185] Compat v4.6.1
  [ffbed154] DocStringExtensions v0.9.3
  [e2ba6199] ExprTools v0.1.9
⌃ [0c68f7d7] GPUArrays v8.6.3
  [46192b85] GPUArraysCore v0.1.4
⌅ [61eb1bfa] GPUCompiler v0.17.3
  [3587e190] InverseFunctions v0.1.8
  [92d709cd] IrrationalConstants v0.2.2
  [692b3bcd] JLLWrappers v1.4.1
  [929cbde3] LLVM v4.16.0
  [2ab3a3ac] LogExpFunctions v0.3.23
  [21216c6a] Preferences v1.3.0
  [74087812] Random123 v1.6.0
  [e6cf234a] RandomNumbers v1.5.3
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [276daf66] SpecialFunctions v2.2.0
  [a759f4b9] TimerOutputs v0.5.22
⌅ [4ee394cb] CUDA_Driver_jll v0.2.0+0
⌅ [76a88914] CUDA_Runtime_jll v0.2.3+2
⌅ [dad2f222] LLVMExtra_jll v0.0.16+2
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.3
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.8.0
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [6462fe0b] Sockets
  [2f01184e] SparseArrays
  [10745b16] Statistics
  [fa267f1f] TOML v1.0.0
  [a4e569a6] Tar v1.10.1
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.0.1+0
  [deac9b47] LibCURL_jll v7.84.0+0
  [29816b5a] LibSSH2_jll v1.10.2+0
  [c8ffd9c3] MbedTLS_jll v2.28.0+0
  [14a3606d] MozillaCACerts_jll v2022.2.1
  [4536629a] OpenBLAS_jll v0.3.20+0
  [05823500] OpenLibm_jll v0.8.1+0
  [83775a58] Zlib_jll v1.2.12+3
  [8e850b90] libblastrampoline_jll v5.1.1+0
  [8e850ede] nghttp2_jll v1.48.0+0
  [3f19e933] p7zip_jll v17.4.0+0
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`

I’m not sure I’m doing this right, but I removed ~/.julia/compiled/*/CUDA_Runtime_jll and then ran the file containing using Pkg; Pkg.add("CUDA"). I got this response:

julia main.jl
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
  No Changes to `~/.julia/environments/v1.8/Project.toml`
  No Changes to `~/.julia/environments/v1.8/Manifest.toml`
Precompiling project...
  2 dependencies successfully precompiled in 32 seconds. 35 already precompiled.
ERROR: LoadError: could not load library "/home/robin/.julia/artifacts/b4f3584e7c5360562ece1d0448b4456c900b69ae/lib/libLLVMExtra-14.so"
libLLVM-14jl.so: cannot open shared object file: No such file or directory
Stacktrace:
  [1] LLVMAddInternalizePassWithExportList
    @ ~/.julia/packages/LLVM/s3bxG/lib/libLLVM_extra.jl:104 [inlined]
  [2] internalize!
    @ ~/.julia/packages/LLVM/s3bxG/src/transform.jl:164 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/irgen.jl:93 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/LLVM/s3bxG/src/base.jl:102 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
  [6] irgen(job::GPUCompiler.CompilerJob, method_instance::Core.MethodInstance; ctx::LLVM.Context)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/irgen.jl:82
  [7] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:219 [inlined]
  [8] macro expansion
    @ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
  [9] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:218 [inlined]
 [10] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any; libraries::Bool, deferred_codegen::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool, ctx::LLVM.Context)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:83
 [11] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:360
 [12] #221
    @ ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:354 [inlined]
 [13] JuliaContext(f::CUDA.var"#221#222"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(saxpy_gpu!), Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}}}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:76
 [14] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:353
 [15] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/cache.jl:90
 [16] cufunction(f::typeof(saxpy_gpu!), tt::Type{Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}}; name::Nothing, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:306
 [17] cufunction(f::typeof(saxpy_gpu!), tt::Type{Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}})
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:299
 [18] top-level scope
    @ ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:102
in expression starting at /home/robin/code/msa/main.jl:38

I can, however, now run using CUDA and then CUDA.versioninfo() in the REPL:

julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.0
NVIDIA driver 525.89.2

Libraries: 
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 12.0.0+525.89.2

Toolchain:
- Julia: 1.8.5
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce MX330 (sm_61, 12.375 MiB / 2.000 GiB available)

I tried running the same code by pasting it into the REPL. It worked until it reached this part:

julia> # Check if I need these
       # @btime CUDA.@sync
       @cuda(
           threads = nThreads,
           blocks = nBlocks,
           saxpy_gpu!(z, a, x, y)
       )
ERROR: could not load library "/home/robin/.julia/artifacts/b4f3584e7c5360562ece1d0448b4456c900b69ae/lib/libLLVMExtra-14.so"
libLLVM-14jl.so: cannot open shared object file: No such file or directory
Stacktrace:
  [1] LLVMAddInternalizePassWithExportList
    @ ~/.julia/packages/LLVM/s3bxG/lib/libLLVM_extra.jl:104 [inlined]
  [2] internalize!
    @ ~/.julia/packages/LLVM/s3bxG/src/transform.jl:164 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/irgen.jl:93 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/LLVM/s3bxG/src/base.jl:102 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
  [6] irgen(job::GPUCompiler.CompilerJob, method_instance::Core.MethodInstance; ctx::LLVM.Context)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/irgen.jl:82
  [7] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:219 [inlined]
  [8] macro expansion
    @ ~/.julia/packages/TimerOutputs/LHjFw/src/TimerOutput.jl:253 [inlined]
  [9] macro expansion
    @ ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:218 [inlined]
 [10] emit_llvm(job::GPUCompiler.CompilerJob, method_instance::Any; libraries::Bool, deferred_codegen::Bool, optimize::Bool, cleanup::Bool, only_entry::Bool, validate::Bool, ctx::LLVM.Context)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:83
 [11] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:360
 [12] #221
    @ ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:354 [inlined]
 [13] JuliaContext(f::CUDA.var"#221#222"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(saxpy_gpu!), Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}}}})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:76
 [14] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:353
 [15] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/cache.jl:90
 [16] cufunction(f::typeof(saxpy_gpu!), tt::Type{Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}}; name::Nothing, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:306
 [17] cufunction(f::typeof(saxpy_gpu!), tt::Type{Tuple{CuDeviceVector{Float16, 1}, Float16, CuDeviceVector{Float16, 1}, CuDeviceVector{Float16, 1}}})
    @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:299
 [18] top-level scope
    @ ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:102
 [19] top-level scope
    @ ~/.julia/packages/CUDA/ZdCxS/src/initialization.jl:155

Here is my code if there are any obvious mistakes: GitHub - robvold/msa at cuda_testing

Again, thank you so much for helping me!

That shouldn’t be possible: Julia 1.8.5 uses LLVM 13, but your output reports LLVM 14.0.6. I guess you’re using a distro version of Julia (i.e., not downloaded from the Julia home page)? That’s not supported, and would explain the error you’re encountering.
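
A quick way to see which Julia binary is running and which LLVM it was built against is sketched below. It uses only Base and the Libdl standard library; llvm_mismatch is a hypothetical helper that encodes nothing more than the statement above that Julia 1.8.5 uses LLVM 13, and it makes no claim about other Julia versions.

```julia
using Libdl

# Report which Julia binary is running and which LLVM it was built against.
println("Julia version:          ", VERSION)
println("Julia binary directory: ", Sys.BINDIR)
println("LLVM version:           ", Base.libllvm_version)

# List the LLVM shared libraries currently loaded into this process.
loaded_llvm = filter(path -> occursin("LLVM", path), Libdl.dllist())
foreach(println, loaded_llvm)

# Hypothetical check: true when a 1.8.x Julia reports an LLVM major version
# other than 13. Always false for other Julia versions (no claim is made).
llvm_mismatch(julia::VersionNumber, llvm::VersionNumber) =
    julia.major == 1 && julia.minor == 8 && llvm.major != 13

println("Unexpected LLVM for Julia 1.8: ", llvm_mismatch(VERSION, Base.libllvm_version))
```

A binary directory such as /usr/bin, as in the debug log earlier in this thread, is a hint that the running Julia came from a distribution package.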

You are correct, I was using the distro version. Now it works! Thank you so much for the help, and sorry for taking up your time with such a trivial problem.