This is Ubuntu 18.04 (I think). I have installed Julia, and the sysop has installed CUDA, but I get
using CuArrays
[ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
and no fun whatsoever
As I have no clue where to start, can anybody here help me track down this error and fix it? Could it be related to the fact that this computer has two NVIDIA GPUs?
Are any of those paths on your library search path? What happens if you do:
julia> using Libdl
julia> Libdl.dlopen("libcuda")
Ptr{Nothing} @0x000055fffd890900
That should just work; if not, there’s something wrong with your local set-up. For example, you might need to add one of those paths to an ld.so.conf entry (a file in /etc/ld.so.conf.d). Or you might be missing dependent libraries: try the dlopen with LD_DEBUG=all, or run ldd on the libcuda.so that should be loadable.
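A minimal shell sketch of those checks, assuming a glibc-based Linux such as Ubuntu with ldconfig on the PATH. The function names in_ld_cache and missing_deps are made up for illustration, and /path/to/libcuda.so in the usage note is a placeholder, not a known location:

```shell
# Hypothetical helpers, not part of CUDA or Julia tooling.

# Succeed if the dynamic loader cache (built from /etc/ld.so.conf.d)
# lists a library whose name contains $1.
in_ld_cache() {
    ldconfig -p | grep -qF -- "$1"
}

# Read `ldd` output on stdin and print the names of dependencies
# that ldd reports as "not found".
missing_deps() {
    awk '/=> not found/ { print $1 }'
}
```

For example, `in_ld_cache libcuda.so || echo "not in loader cache"` tests the cache, and `ldd /path/to/libcuda.so | missing_deps` lists unresolved dependencies. Running `LD_DEBUG=libs julia -e 'using Libdl; Libdl.dlopen("libcuda")'` shows which directories the loader actually tries; libs is a quieter alternative to all.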
Oh thanks! After talking to the sysop here, things have been fixed; there was some sort of problem with the CUDA install. Now it is working… partly! Strangely enough, with my Tesla GPU with 12 GB of RAM, I am unable to run codes that I can safely run on a GTX 760 and a 1050 Ti, both with 4 GB.
In my codes where I generate random numbers for sampling, I’m getting messages like
ERROR: LoadError: CURANDError(code CURAND_STATUS_ALLOCATION_FAILED, Memory allocation failed)
Stacktrace:
[1] macro expansion at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/error.jl:51 [inlined]
[2] curandGenerateSeeds at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/libcurand.jl:151 [inlined]
[3] seed!(::CuArrays.CURAND.RNG, ::Int64, ::Int64) at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/random.jl:40
[4] seed! at /home/mazzanti/.julia/packages/CuArrays/7z7MV/src/rand/random.jl:36 [inlined] (repeats 2 times)
[5] top-level scope at /home/mazzanti/Julia_1/AIS_NEW_2/AIS_CUDA/AIS_1_CUDA_J1_2.jl:364
[6] include at ./boot.jl:328 [inlined]
[7] include_relative(::Module, ::String) at ./loading.jl:1094
[8] include(::Module, ::String) at ./Base.jl:31
[9] exec_options(::Base.JLOptions) at ./client.jl:295
[10] _start() at ./client.jl:464
…errors related to CuRAND? I have installed CuArrays, CUDAdrv and CUDAnative. What can be causing this here?
Yes, I call CuArrays.seed!(iseed) for some iseed number. Commenting this line out solves the problem.
…but then, how am I supposed to initialize the random seed?
Thanks again,
Ferran.
I’m not saying it’s bad to call that method; you just happen to do so at a point where CuArrays has presumably used (and cached) all of your GPU’s memory. That’s a bug. It would need something like https://github.com/JuliaGPU/CuArrays.jl/issues/426 to resolve. For now, can you try calling GC.gc(true) and CuArrays.BinnedPool.reclaim(true) before the call to seed!? That should free up memory.
using CuArrays
using CUDAdrv
using CUDAnative
GC.gc(true)
CuArrays.BinnedPool.reclaim(true)
UndefVarError: BinnedPool not defined
Stacktrace:
[1] getproperty(::Module, ::Symbol) at ./Base.jl:13
[2] top-level scope at In[5]:2
I have just done ] update and nothing related to CuArrays updated, so I would say that yes, I have the latest version of CuArrays.
In any case, you want me to try ] add CuArrays#a3b4bf4
That gave me an error as well:
ERROR: Unsatisfiable requirements detected for package CUDAnative [be33ccc6]:
CUDAnative [be33ccc6] log:
├─possible versions are: [0.7.0, 0.8.0-0.8.10, 0.9.0-0.9.1, 0.10.0-0.10.1, 1.0.0-1.0.1, 2.0.0-2.0.1, 2.1.0-2.1.3, 2.2.0-2.2.1, 2.3.0-2.3.1, 2.4.0, 2.5.0-2.5.5] or uninstalled
├─restricted to versions 2.5.0-2 by CuArrays [3a865a2d], leaving only versions 2.5.0-2.5.5
│ └─CuArrays [3a865a2d] log:
│   ├─possible versions are: 1.4.7 or uninstalled
│   └─CuArrays [3a865a2d] is fixed to version 1.4.7
└─restricted to versions 2.4.0 by an explicit requirement — no versions left
No, that’s not how it works. CuArrays might be held back because of other packages, and a ] st would show which version is installed. From the output below, that’s exactly what’s happening: for some reason, you have an explicit dependency on CUDAnative 2.4.0, which holds back CuArrays to version 1.3. But even with that version BinnedPool exists, so I’m not sure what’s up; it’s hard to guess without knowing the actual version of the package installed.
So then, would you say I should uninstall CuArrays, CUDAnative and CUDAdrv (all three, to make sure) and install them again (which would get the very latest versions, I guess)?
If libcuda is not found, it basically means that libcuda.so is not in any directory the dynamic loader searches, most commonly because its directory is missing from LD_LIBRARY_PATH. This is almost always the cause of such issues.
The steps to troubleshoot “could not load library libfoo” are:
1. Investigate the list of library search paths → echo $LD_LIBRARY_PATH
2. Find the location of libfoo.so (in your case libcuda.so) → e.g. find /usr -name libfoo.so
3. Check if the parent directory of libfoo.so is listed in $LD_LIBRARY_PATH
4. If it is not → add that directory to $LD_LIBRARY_PATH → e.g. put export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/path/from/step/two" into ~/.bashrc, or just run the line in the shell for a quick check
Step 4 is for Bash and Zsh. For Csh you need to use setenv ...
That should be it.
If libcuda.so was not found at all, it needs to be installed.
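The steps above can be sketched as a small POSIX shell script. This is only an illustration: the function names dir_in_path_list, find_lib_dir and check_lib are hypothetical, searching under /usr is just a default assumption, and the script looks only at LD_LIBRARY_PATH, not at the ld.so cache.

```shell
# Hypothetical helpers illustrating the troubleshooting steps.

# Step 3: succeed if directory $1 is an entry of the colon-separated
# list $2 (a trailing slash on either side is ignored).
# The body runs in a subshell so IFS and set -f do not leak out.
dir_in_path_list() (
    dir="${1%/}"
    set -f          # no filename expansion while splitting the list
    IFS=':'
    for entry in $2; do
        [ "${entry%/}" = "$dir" ] && exit 0
    done
    exit 1
)

# Step 2: print the directory of the first file named $1 found
# under $2 (default: /usr); fail if nothing is found.
find_lib_dir() {
    hit="$(find "${2:-/usr}" -name "$1" 2>/dev/null | head -n 1)"
    [ -n "$hit" ] && dirname "$hit"
}

# Steps 1-4 combined: report whether the library's directory is in
# LD_LIBRARY_PATH, and print the export line to use if it is not.
check_lib() {
    dir="$(find_lib_dir "$1" "${2:-/usr}")" || {
        printf '%s\n' "$1 not found under ${2:-/usr}; it may need to be installed"
        return 1
    }
    if dir_in_path_list "$dir" "${LD_LIBRARY_PATH:-}"; then
        printf '%s\n' "$dir is already in LD_LIBRARY_PATH"
    else
        printf '%s\n' "export LD_LIBRARY_PATH=\"\${LD_LIBRARY_PATH}:$dir\""
    fi
}
```

After sourcing the file, `check_lib libcuda.so` either confirms that the directory is already listed or prints the export line from step 4, which you can paste into ~/.bashrc.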