Can't load CuArrays

Hi,

This is Ubuntu 18.04 (I think). I have installed Julia and the sysop has installed CUDA, but I get

using CuArrays

[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)

and no fun whatsoever :frowning:

As I have no clue where to start, can anybody here help me track down this error and fix it? Could it be related to the fact that this computer has two NVIDIA GPUs?

Thx in advance,

Ferran.

Please try the suggestion. Set JULIA_CUDA_VERBOSE=true and try again.
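
For anyone unsure how to set that variable, here is a minimal sketch. It assumes a Bash-like shell and that julia is on your PATH; the variable name comes from the info message above.

```shell
# Enable verbose CUDA initialization messages for this shell session.
export JULIA_CUDA_VERBOSE=true

# Only attempt the load if julia is actually on PATH (an assumption about your setup).
if command -v julia >/dev/null 2>&1; then
    julia -e 'using CuArrays' || echo "loading CuArrays failed, see the messages above"
fi
```

Setting the variable inline for a single run (JULIA_CUDA_VERBOSE=true julia) should work as well.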

Hi,
just did that and got

julia> using CuArrays
┌ Error: CUDAdrv.jl failed to initialize
│   exception =
│    could not load library "libcuda"
│    libcuda.so: cannot open shared object file: No such file or directory
│    Stacktrace:
│     [1] #dlopen#3(::Bool, ::typeof(Libdl.dlopen), ::String, ::UInt32) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/Libdl/src/Libdl.jl:109
│     [2] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/Libdl/src/Libdl.jl:109 [inlined]
│     [3] #dlopen#2 at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/Libdl/src/Libdl.jl:105 [inlined]
│     [4] dlopen at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/Libdl/src/Libdl.jl:105 [inlined] (repeats 2 times)
│     [5] (::getfield(CUDAdrv, Symbol("##407#lookup_fptr#83")))() at /home/mazzanti/.julia/packages/CUDAapi/CCgJL/src/call.jl:29
│     [6] macro expansion at /home/mazzanti/.julia/packages/CUDAapi/CCgJL/src/call.jl:37 [inlined]
│     [7] macro expansion at /home/mazzanti/.julia/packages/CUDAdrv/3EzC1/src/error.jl:121 [inlined]
│     [8] cuInit(::Int64) at /home/mazzanti/.julia/packages/CUDAdrv/3EzC1/src/libcuda.jl:18
│     [9] __init__() at /home/mazzanti/.julia/packages/CUDAdrv/3EzC1/src/CUDAdrv.jl:56
│     [10] _include_from_serialized(::String, ::Array{Any,1}) at ./loading.jl:685
│     [11] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:765
│     [12] _tryrequire_from_serialized(::Base.PkgId, ::UInt64, ::String) at ./loading.jl:700
│     [13] _require_search_from_serialized(::Base.PkgId, ::String) at ./loading.jl:754
│     [14] _require(::Base.PkgId) at ./loading.jl:990
│     [15] require(::Base.PkgId) at ./loading.jl:911
│     [16] require(::Module, ::Symbol) at ./loading.jl:906
│     [17] eval(::Module, ::Any) at ./boot.jl:330
│     [18] eval_user_input(::Any, ::REPL.REPLBackend) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl:86
│     [19] macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl:118 [inlined]
│     [20] (::getfield(REPL, Symbol("##26#27")){REPL.REPLBackend})() at ./task.jl:268
└ @ CUDAdrv ~/.julia/packages/CUDAdrv/3EzC1/src/CUDAdrv.jl:67
┌ Warning: CUDAnative.jl did not initialize because CUDAdrv.jl failed to
└ @ CUDAnative ~/.julia/packages/CUDAnative/2WQzk/src/CUDAnative.jl:63
┌ Warning: CuArrays.jl did not initialize because CUDAdrv.jl or CUDAnative.jl failed to
└ @ CuArrays ~/.julia/packages/CuArrays/7z7MV/src/CuArrays.jl:69

while libcuda IS in my system

locate libcuda.so

/usr/lib/i386-linux-gnu/libcuda.so
/usr/lib/i386-linux-gnu/libcuda.so.1
/usr/lib/i386-linux-gnu/libcuda.so.430.26
/usr/lib/x86_64-linux-gnu/libcuda.so.1
/usr/lib/x86_64-linux-gnu/libcuda.so.430.26
/usr/local/cuda-10.0/doc/man/man7/libcuda.so.7
/usr/local/cuda-10.0/lib64/stubs/libcuda.so
/usr/share/man/man7/libcuda.so.7

so what can be happening here?

Thanks for your help,

Ferran

Are any of those paths on your library search path? What happens if you do:

julia> using Libdl

julia> Libdl.dlopen("libcuda")
Ptr{Nothing} @0x000055fffd890900

That should just work, if not there’s something wrong with your local set-up. For example, you might need to add one of those paths to some ld.so.conf entry (a file in /etc/ld.so.conf.d). Or you might be missing dependent libraries, try the dlopen with LD_DEBUG=all, or try doing ldd on the libcuda.so that should be open-able.
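
The checks suggested above could look roughly like this from a shell. This is only a sketch: lib_in_cache is a hypothetical helper name, and the library path is an assumption copied from the locate output earlier in the thread, so adjust it to your system.

```shell
# Hypothetical helper: succeed if the dynamic linker's cache lists a library
# whose name contains $1.
lib_in_cache() {
    # ldconfig often lives in /sbin, which may not be on a normal user's PATH.
    ( PATH="$PATH:/sbin:/usr/sbin"; ldconfig -p ) 2>/dev/null | grep -q -F -- "$1"
}

if lib_in_cache "libcuda.so"; then
    echo "libcuda.so is in the linker cache"
else
    echo "libcuda.so is NOT in the linker cache"
fi

# Assumed path, taken from the locate output in this thread.
lib=/usr/lib/x86_64-linux-gnu/libcuda.so.1
if [ -e "$lib" ]; then
    # Print any dependency that the linker cannot resolve.
    ldd "$lib" | grep "not found" || echo "all dependencies of $lib resolved"
fi
```

If the library is missing from the cache but present on disk, the ld.so.conf route is the likely fix; if ldd reports "not found" lines, dependent libraries are missing.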

Oh thanks. After talking to the sysop here, things have been fixed: there was some sort of problem with the CUDA install. Now it is working… partly! Strangely enough, with my Tesla GPU with 12 GB of RAM, I am unable to run code that I can safely run on a GTX 760 and a 1050 Ti, both with 4 GB.
In my codes where I generate random numbers for sampling, I’m getting messages like

ERROR: LoadError: CURANDError(code CURAND_STATUS_ALLOCATION_FAILED, Memory allocation failed)
Stacktrace:
 [1] macro expansion at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/error.jl:51 [inlined]
 [2] curandGenerateSeeds at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/libcurand.jl:151 [inlined]
 [3] seed!(::CuArrays.CURAND.RNG, ::Int64, ::Int64) at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/random.jl:40
 [4] seed! at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/random.jl:36 [inlined] (repeats 2 times)
 [5] top-level scope at /home/mazzanti/Julia_1/AIS_NEW_2/AIS_CUDA/AIS_1_CUDA_J1_2.jl:364
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1094
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] exec_options(::Base.JLOptions) at ./client.jl:295
 [10] _start() at ./client.jl:464

…errors related to CuRAND? I have installed CuArrays, CUDAdrv and CUDAnative. What can be causing this here?

Thanks again,

Ferran

This is similar to https://github.com/JuliaGPU/CuArrays.jl/issues/491 – we don’t expect any allocations (here, by curandGenerateSeeds) to happen outside of the memory pool.

I take it you call seed! explicitly?

Yes, I call CuArrays.seed!(iseed) for some iseed number. Commenting this line out solves the problem.
…but then, how am I supposed to initialize the random seed?
Thanks again,
Ferran.

I’m not saying it’s bad to call that method, you just happen to do so at a point where CuArrays has presumably used (and cached) all of your GPU’s memory. That’s a bug :slight_smile: Would need something like https://github.com/JuliaGPU/CuArrays.jl/issues/426 to resolve. For now, can you try calling GC.gc(true) and CuArrays.BinnedPool.reclaim(true) before the call to seed!? That should free up memory.

Could you try https://github.com/JuliaGPU/CuArrays.jl/pull/504 ?

Yeah sure… willing to help/test.
But what do I have to do exactly :smiley:?

BTW

using CuArrays
using CUDAdrv
using CUDAnative

GC.gc(true)
CuArrays.BinnedPool.reclaim(true)

UndefVarError: BinnedPool not defined

Stacktrace:
 [1] getproperty(::Module, ::Symbol) at ./Base.jl:13
 [2] top-level scope at In[5]:2

Are you using the latest version of CuArrays?
You can check out branches using ] add CuArrays#branchname.

I have just done ] update and nothing related to CuArrays updated, so I would say that yes, I have the latest version of CuArrays.
In any case, you want me to try
] add CuArrays#a3b4bf4
that gave me an error also

ERROR: Unsatisfiable requirements detected for package CUDAnative [be33ccc6]:
 CUDAnative [be33ccc6] log:
 ├─possible versions are: [0.7.0, 0.8.0-0.8.10, 0.9.0-0.9.1, 0.10.0-0.10.1, 1.0.0-1.0.1, 2.0.0-2.0.1, 2.1.0-2.1.3, 2.2.0-2.2.1, 2.3.0-2.3.1, 2.4.0, 2.5.0-2.5.5] or uninstalled
 ├─restricted to versions 2.5.0-2 by CuArrays [3a865a2d], leaving only versions 2.5.0-2.5.5
 │ └─CuArrays [3a865a2d] log:
 │   ├─possible versions are: 1.4.7 or uninstalled
 │   └─CuArrays [3a865a2d] is fixed to version 1.4.7
 └─restricted to versions 2.4.0 by an explicit requirement — no versions left

No, that’s not how it works. CuArrays might be held back because of other packages, and a ] st would show which version is installed. From the output above, that’s exactly what’s happening: for some reason, you have an explicit dependency on CUDAnative 2.4.0, which holds back CuArrays to version 1.3. But even with that version BinnedPool exists, so I’m not sure what’s up; it is hard to guess without knowing the actual version of the package installed.

So then, would you say I should uninstall CuArrays, CUDAnative and CUDAdrv (all three, to make sure) and install them again (which would get the very latest versions, I guess)?

I hit the same problem when initializing CUDAdrv. I then tried this libcuda check, and it does not work either.
How can I fix this problem?

If libcuda is not found, it basically means that libcuda.so was not found on the library search path (LD_LIBRARY_PATH, the ld.so cache, or the default system directories). This is almost always the cause of such issues.

The steps to troubleshoot “could not load library libfoo” are:

  1. Investigate the list of library search paths → echo $LD_LIBRARY_PATH
  2. Find the location of libfoo.so (in your case libcuda.so) → e.g. find /usr -name libfoo.so
  3. Check if the parent directory of libfoo.so is listed in $LD_LIBRARY_PATH
  4. If not 3. → put the path of libfoo.so into $LD_LIBRARY_PATH → e.g. put export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/path/from/step/two" into ~/.bashrc or just run the line in the shell for a quick check

Step 4 is for Bash and Zsh. For Csh you need to use setenv ....

That should be it.
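
Put together, the four steps could be sketched as below for Bash or Zsh. The helper name dir_in_path_list is hypothetical, and the find filters (skipping man pages and the toolkit's link-time stubs directory) are my assumption about which hits are useful, based on the locate output earlier in the thread.

```shell
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2.
dir_in_path_list() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Step 1: show the current search path (it may be empty or unset).
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"

# Step 2: locate the library, skipping man pages and the stubs directory.
libpath=$(find /usr -name 'libcuda.so*' ! -path '*/man/*' ! -path '*/stubs/*' 2>/dev/null | head -n 1)

if [ -n "$libpath" ]; then
    libdir=$(dirname "$libpath")
    # Step 3: is the parent directory already listed?
    if dir_in_path_list "$libdir" "${LD_LIBRARY_PATH:-}"; then
        echo "$libdir is already on LD_LIBRARY_PATH"
    else
        # Step 4: append it, avoiding a leading ':' when the variable is empty.
        export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$libdir"
        echo "added $libdir to LD_LIBRARY_PATH for this session"
    fi
else
    echo "no libcuda.so found under /usr"
fi
```

The export only lasts for the current shell session; to make it permanent, put the export line into ~/.bashrc as described in step 4.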

If libcuda.so was not found, it needs to be installed :wink:

It’s better to add a path to /etc/ld.so.conf or an entry to /etc/ld.so.conf.d since LD_LIBRARY_PATH can get overridden easily.

If you’re missing libcuda you probably don’t have the NVIDIA driver installed, or only parts of it.
