CuArrays can't find libcuda

I installed CUDA 10.2 on Ubuntu 18.10 and was able to build and run some of the examples from the CUDA samples folder, such as deviceQuery.

But when I use Julia 1.0.5 LTS, CuArrays can't find libcuda.

Setting JULIA_DEBUG=CUDAapi does not show me the directory probe information either. I'm not sure how to solve this issue at this point. Would someone be able to point me in the right direction? Thanks in advance.

$ JULIA_DEBUG=CUDAapi julia-1.0.5
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.0.5 (2019-09-09)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using CuArrays
[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
[ Info: Recompiling stale cache file /home/ctrotter/.julia/compiled/v1.0/CuArrays/7YFE0.ji for CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]

Did you try out the suggestion from the error message?

Here is the information that gets printed, but it doesn't show where it is looking for libcuda.so.

julia> ENV["JULIA_CUDA_VERBOSE"]=true
true
julia> using CuArrays
┌ Error: CUDAdrv.jl failed to initialize
│   exception =
│    could not load library "libcuda"
│    libcuda.so: cannot open shared object file: No such file or directory
│    Stacktrace:
│     [1] dlopen(::String, ::UInt32) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Libdl/src/Libdl.jl:97
│     [2] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Libdl/src/Libdl.jl:94 [inlined] (repeats 2 times)
│     [3] (::getfield(CUDAdrv, Symbol("##408#lookup_fptr#83")))() at /home/ctrotter/.julia/packages/CUDAapi/K94wY/src/call.jl:29
│     [4] __init__() at /home/ctrotter/.julia/packages/CUDAapi/K94wY/src/call.jl:37
│     [5] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:633
│     [6] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:713
│     [7] _tryrequire_from_serialized(::Base.PkgId, ::UInt64, ::String) at ./loading.jl:648
│     [8] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:702
│     [9] _tryrequire_from_serialized(::Base.PkgId, ::UInt64, ::String) at ./loading.jl:648
│     [10] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:702
│     [11] _require(::Base.PkgId) at ./loading.jl:937
│     [12] require(::Base.PkgId) at ./loading.jl:858
│     [13] require(::Module, ::Symbol) at ./loading.jl:853
│     [14] eval(::Module, ::Any) at ./boot.jl:319
│     [15] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:85
│     [16] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:117 [inlined]
│     [17] (::getfield(REPL, Symbol("##28#29")){REPL.REPLBackend})() at ./task.jl:259
└ @ CUDAdrv ~/.julia/packages/CUDAdrv/aBgcd/src/CUDAdrv.jl:67
┌ Warning: CUDAnative.jl did not initialize because CUDAdrv.jl failed to
└ @ CUDAnative ~/.julia/packages/CUDAnative/Phjco/src/CUDAnative.jl:66
┌ Warning: CuArrays.jl did not initialize because CUDAdrv.jl or CUDAnative.jl failed to
└ @ CuArrays ~/.julia/packages/CuArrays/rNxse/src/CuArrays.jl:64

Nowhere; it should be readily available since it's an OS/kernel-dependent library, so it should be discoverable using dlopen without additional arguments. If it isn't, you might have to add it to ld.so.conf or use LD_LIBRARY_PATH as a workaround.
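To see what the loader can resolve without any extra search paths, you can query the linker cache directly (a quick sketch, assuming a glibc system; dlopen consults this same cache):

```shell
# dlopen("libcuda") ultimately falls back on the cache that ldconfig
# maintains, so this shows whether the name is resolvable system-wide.
if ldconfig -p 2>/dev/null | grep libcuda; then
    echo "libcuda is in the linker cache"
else
    echo "libcuda is NOT in the linker cache"
fi
```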

$ echo $LD_LIBRARY_PATH
/usr/lib/x86_64-linux-gnu/:/usr/local/cuda-10.2/lib64

$ locate libcuda.so
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.440.33.01
/usr/local/cuda-10.2/doc/man/man7/libcuda.so.7
/usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs/libcuda.so

I added /usr/lib/x86_64-linux-gnu/ to LD_LIBRARY_PATH in an effort to solve this issue. Originally it was just /usr/local/cuda-10.2/lib64, and I see that libcuda.so is not there.
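A quick way to confirm that is to scan each directory on LD_LIBRARY_PATH for the unversioned name that dlopen asks for (a minimal sketch; dlopen("libcuda") looks for libcuda.so, and libcuda.so.1 alone will not match it):

```shell
# Check every LD_LIBRARY_PATH entry for an unversioned libcuda.so.
found=0
IFS=':'
for d in $LD_LIBRARY_PATH; do
    if [ -e "$d/libcuda.so" ]; then
        echo "found: $d/libcuda.so"
        found=1
    fi
done
if [ "$found" -eq 0 ]; then
    echo "no unversioned libcuda.so on LD_LIBRARY_PATH"
fi
```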

So does it work now? Can you dlopen libcuda? FWIW, installing CUDA (however you did) should have added an entry to /etc/ld.so.conf.d/.
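To check for such an entry without guessing the exact filename (it varies per CUDA version), something like this works:

```shell
# List any linker configuration mentioning CUDA; the installer normally
# drops a file like cuda-10-2.conf into this directory.
grep -r cuda /etc/ld.so.conf.d/ 2>/dev/null \
    || echo "no CUDA entry in /etc/ld.so.conf.d/"
```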

Thank you for the reply.

There is an entry:

$ sudo cat /etc/ld.so.conf.d/cuda-10-2.conf
/usr/local/cuda-10.2/targets/x86_64-linux/lib

Looking at where libcuda.so is located, I notice it lives one more directory level down (…/lib/stubs/libcuda.so), but adding /stubs to the conf file didn't solve the issue. Edit: I put /usr/local/cuda-10.2 in the cuda-10-2.conf file, rebooted, and that worked, too.

Eventually, I solved this by doing one of the following.

  1. Add the location of libcuda.so to LD_LIBRARY_PATH. In my case, that is the last line of the locate libcuda.so output above:
    $ export LD_LIBRARY_PATH=/usr/local/cuda-10.2/targets/x86_64-linux/lib/stubs/:$LD_LIBRARY_PATH

  2. Or make a symlink in /usr/lib/x86_64-linux-gnu/:
    $ sudo ln -s /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so
    I found this solution because Libdl.dlopen("libcuda.so.1") works.
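To illustrate why the second fix works, here is a self-contained sketch using a hypothetical scratch directory (the real library lives in /usr/lib/x86_64-linux-gnu/): the driver package ships only versioned files, so a lookup for the bare name "libcuda.so" finds nothing until the unversioned symlink exists.

```shell
# Scratch-directory illustration (hypothetical paths, not the real ones).
demo=/tmp/libcuda-demo
mkdir -p "$demo"
touch "$demo/libcuda.so.1"              # stand-in for the real driver library
ls "$demo/libcuda.so" 2>/dev/null || echo "libcuda.so missing"
ln -sf libcuda.so.1 "$demo/libcuda.so"  # the fix from option 2
ls -l "$demo/libcuda.so"
rm -rf "$demo"
```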

Given that not everyone has sudo privileges, the first one may be the better solution.

Thanks for your help! @maleadt

But you need root privileges to install CUDA anyway, at which point an ld.so.conf entry should have been set up.

Anyway, glad you got it to work! And FWIW, after modifying the ld.so settings you need to run ldconfig (a reboot wouldn’t have been necessary then).

I did try ldconfig after, but somehow it didn’t update the cache. I was getting the same error.

Also, I am so happy to see documentation for CUDA.jl. It is very appreciated!

I probably should open another thread for this question, but maybe you @maleadt have a quick answer.

Can CUDA.jl calls be compiled into an executable?
In other words, I am compiling everything in my application into an executable for deployment, and some of the functions require a GPU. If the answer isn't short, I will be happy to create another question and include a minimal working example.

PackageCompiler-style? No, that's not supported. It used to be possible to write the compiled PTX kernels to disk and load them instead of recompiling, but that didn't help much: compilation is fairly quick (tens of ms for reasonable kernels); it's the initial compilation of CUDAnative that takes long.