A newbie cannot use CUDA by following the official docs!

New PC, new Julia, new CUDA packages. I followed https://juliagpu.gitlab.io/CUDA.jl/ exactly, but it doesn’t work!

using CUDAdrv, CUDAnative, CuArrays

I get an error:
┌ Info: CUDAnative.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
└ @ CUDAnative C:\Users\USER\.julia\packages\CUDAnative\hfulr\src\CUDAnative.jl:192

OS: Windows 10, GPU: GTX 1650 SUPER, Julia 1.3.1

Have you installed the CUDA drivers and toolkit and tested that it is working?
https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html


Did you follow the advice in that message and set the environment variable JULIA_CUDA_VERBOSE to get more details? You’re not the first to miss this hint, so no worries. I never thought it would be so confusing; the next CUDAnative/CuArrays release will get rid of it.
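For anyone unsure how to set that variable from within Julia, here is a minimal sketch. It assumes the packages read JULIA_CUDA_VERBOSE when they initialize, so the variable has to be set before the first `using`; the `Base.find_package` guard is only there so the snippet also runs on a machine without CUDAnative installed.

```julia
# Set the variable before loading the package.
# Assumption: it is read once, when the package initializes.
ENV["JULIA_CUDA_VERBOSE"] = "true"

# Load CUDAnative only if it is installed in the active environment.
if Base.find_package("CUDAnative") !== nothing
    @eval using CUDAnative
else
    @info "CUDAnative is not installed in this environment"
end
```

Setting the variable in the shell before starting Julia should have the same effect, as long as the Julia process actually inherits it.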


I had a similar problem on Ubuntu 18.04.

I set JULIA_CUDA_VERBOSE=true in my .bashrc file, but I still get the same error:

┌ Info: CUDAdrv.jl failed to initialize, GPU functionality unavailable (set JULIA_CUDA_SILENT or JULIA_CUDA_VERBOSE to silence or expand this message)
└ @ CUDAdrv /home/mehrdad/.julia/packages/CUDAdrv/aBgcd/src/CUDAdrv.jl:69

After a fresh restart, I get the following error:

Something seems off here. Julia failed to render the stack-trace passed to @error, which is something that has worked since 1.0. Which version of Julia are you using?

I am using version 1.2. I have updated the packages and followed the instructions for CUDA installation.

Is there a way to resolve this please?

What is the output of nvidia-smi?

You could also try to move ~/.julia to e.g. ~/.julia.bak and reinstall the packages, it has sometimes helped with strange problems.

I am using CUDAdrv on an Ubuntu 18.04 box without problems (so there is hope :slight_smile:).

This is the output of nvidia-smi:

Tue Apr 21 16:25:55 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960M    On   | 00000000:01:00.0 Off |                  N/A |
| N/A   63C    P0    N/A /  N/A |    587MiB /  2004MiB |      5%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1478      G   /usr/bin/gnome-shell                         116MiB |
|    0      4612      G   /usr/lib/xorg/Xorg                           469MiB |
+-----------------------------------------------------------------------------+

Have you tried the latest Julia release, 1.4.1?

I uninstalled all packages and deleted Julia, then installed Julia 1.4.1 and all the packages again. Now, when I execute “using CUDAdrv”, I do not get an error. However, when I execute the following program (from the NVIDIA website, https://devblogs.nvidia.com/gpu-computing-julia-programming-language/):

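using CUDAdrv, CUDAnative, CuArrays  # CuArray is provided by CuArrays

# each GPU thread adds one pair of elements
function kernel_vadd(a, b, c)
    i = threadIdx().x
    c[i] = a[i] + b[i]
    return
end

# generate some data
len = 512
a = rand(Int, len)
b = rand(Int, len)

# allocate & upload to the GPU
d_a = CuArray(a)
d_b = CuArray(b)
d_c = similar(d_a)

# execute and fetch results
# (keyword launch syntax; the blog post uses the older form `@cuda (1,len) kernel_vadd(...)`)
@cuda threads=len kernel_vadd(d_a, d_b, d_c)
c = Array(d_c)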

I get the following error (the error is long, so I pasted it in a Google doc)

Please use appropriate markup for code blocks, your posts are hard to read otherwise. See PSA: make it easier to help you

As I said, there seems to be something up with your Julia, as it does not render the exception (anything in your startup.jl?)

Also, the error being thrown is one not known to the wrapped CUDA headers. Could you try the following to see the actual code:

julia> using CUDAdrv
[ Info: Precompiling CUDAdrv [c5f51814-7f29-56b8-a69c-e4d8f6be1fde]

julia> CUDAdrv.unsafe_cuCtxGetCurrent(Ref{CUDAdrv.CUcontext}())
CUDA_SUCCESS::cudaError_enum = 0x00000000

julia> Int(ans)
0

I apologize for the bad write-up.

There are some Python-related settings in my startup.jl file. Shall I delete those?

When I execute using CUDAdrv, nothing happens (it should usually show the precompiling info, I guess).

When I execute CUDAdrv.unsafe_cuCtxGetCurrent(Ref{CUDAdrv.CUcontext}()), I get the following error

┌ Error: Could not initialize CUDA
│   exception = (CuError(CUDAdrv.UnknownMember, nothing), Union{Ptr{Nothing}, Base.InterpreterIP}[Ptr{Nothing} @0x00007fcb548d179e, Ptr{Nothing} @0x00007fcb548d2d24, Ptr{Nothing} @0x00007fcb548d3c1d, Ptr{Nothing} @0x00007fcb548d3e6b, Ptr{Nothing} @0x00007fcb548d4063, Ptr{Nothing} @0x00007fcb548d4110, Ptr{Nothing} @0x00007fcb548d4219, Ptr{Nothing} @0x00007fcb548d1341, Ptr{Nothing} @0x00007fcb548d1391, Ptr{Nothing} @0x00007fcb8298b062, Ptr{Nothing} @0x00007fcb829a3b65, Ptr{Nothing} @0x00007fcb829a37af, Ptr{Nothing} @0x00007fcb829a4d63, Ptr{Nothing} @0x00007fcb829a5fa7, Base.InterpreterIP in top-level CodeInfo for Main at statement 4, Ptr{Nothing} @0x00007fcb829c32e9, Ptr{Nothing} @0x00007fcb829c3f34, Ptr{Nothing} @0x00007fcb548c902e, Ptr{Nothing} @0x00007fcb8298b062, Ptr{Nothing} @0x00007fcb548ba1be, Ptr{Nothing} @0x00007fcb8298b40a, Ptr{Nothing} @0x00007fcb8299a5cb, Ptr{Nothing} @0x00007fcb8299ae81, Ptr{Nothing} @0x00007fcb548aded2, Ptr{Nothing} @0x00007fcb548ae0ae, Ptr{Nothing} @0x00007fcb548ae0cc, Ptr{Nothing} @0x00007fcb8298b062, Ptr{Nothing} @0x00007fcb829a9a5e, Ptr{Nothing} @0x0000000000000000])
└ @ CUDAdrv /home/mehrdad/.julia/packages/CUDAdrv/YK1gX/src/CUDAdrv.jl:106
AssertionError: CUDAdrv.jl did not successfully initialize, and is not usable.

Stacktrace:
 [1] libcuda at /home/mehrdad/.julia/packages/CUDAdrv/YK1gX/src/CUDAdrv.jl:82 [inlined]
 [2] (::CUDAdrv.var"#618#cache_fptr!#140")() at /home/mehrdad/.julia/packages/CUDAapi/XuSHC/src/call.jl:31
 [3] macro expansion at /home/mehrdad/.julia/packages/CUDAapi/XuSHC/src/call.jl:39 [inlined]
 [4] unsafe_cuCtxGetCurrent(::Base.RefValue{Ptr{Nothing}}) at /home/mehrdad/.julia/packages/CUDAdrv/YK1gX/src/libcuda.jl:145
 [5] top-level scope at In[2]:1

Oh, so it’s actually cuInit that fails. Try the following then:

julia> using CUDAdrv

julia> CUDAdrv.CUDAapi.@runtime_ccall((:cuInit, CUDAdrv.__libcuda), CUDAdrv.CUresult, (UInt32,), 0)
CUDA_SUCCESS::cudaError_enum = 0x00000000

julia> Int(ans)
0

This is the error I get

UnknownMember::cudaError_enum = 0xffffffff
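A small aside on reading that value: 0xffffffff is the unsigned 32-bit bit pattern of -1. The sketch below only shows the reinterpretation in plain Julia; it does not call into CUDA, and the variable names are made up for illustration.

```julia
# The value shown in the output above, as a UInt32 literal.
code = 0xffffffff

# View the same 32 bits as a signed integer.
signed = reinterpret(Int32, code)

println(typeof(code), " ", code, " as Int32 is ", signed)
```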

This return code of -1 indicates you are using the CUDA stub libraries. You probably have not installed the NVIDIA driver, or don’t have it at a discoverable path. Try:

julia> using Libdl

julia> Libdl.dlpath("libcuda")
"/usr/bin/../lib/libcuda.so"

Also, could you try the failing example again but launching julia with --startup-file=no to see if the failure to render that exception is due to some code you have loaded, and show the output of using Logging; show(global_logger()) when you run your script? Be sure to show exactly the code you are running, in which environment (e.g. a REPL, Juno, or whatnot); I’d like to know what’s causing the output issue :slightly_smiling_face:


When I run

using Libdl
Libdl.dlpath("libcuda")

I get

"/usr/local/cuda/lib64/stubs/libcuda.so"

I start Julia by typing julia --startup-file=no. Then I execute using IJulia;jupyter-notebook().

I executed the example with the additional commands. It still fails and produces the previous error. The new commands generate this extra output:

Base.CoreLogging.SimpleLogger(IJulia.IJuliaStdio{Base.PipeEndpoint}(IOContext(Base.PipeEndpoint(RawFD(0x0000002d) open, 0 bytes waiting))), Info, Dict{Any,Int64}())

Aha, so IJulia messes up the logging. Good to know.
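If you want readable errors inside a notebook in the meantime, one possible workaround is to run the failing code under a ConsoleLogger. This is only a sketch: it assumes that the stock ConsoleLogger from the Logging standard library renders the `exception=(err, backtrace)` pair that the SimpleLogger shown above prints raw, and it writes into an IOBuffer purely so the result can be inspected.

```julia
using Logging

# Show which logger is currently installed (a SimpleLogger in the IJulia output above).
println(typeof(global_logger()))

# Log an exception through a ConsoleLogger, capturing the rendered text in a buffer.
buf = IOBuffer()
with_logger(ConsoleLogger(buf, Logging.Info)) do
    try
        error("boom")  # stand-in for the failing initialization
    catch err
        @error "Could not initialize" exception=(err, catch_backtrace())
    end
end

output = String(take!(buf))
print(output)
```

In a notebook you would pass `stderr` (or `stdout`) instead of the buffer.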

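Anyway, the dlpath output confirms my suspicion: Julia is picking up the stub libcuda from the CUDA toolkit instead of the one shipped with the driver. You need to install the NVIDIA driver, or, if it is already installed, make sure its libcuda is discoverable.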
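To repeat this check on another machine, the dlpath test can be wrapped in a small script. This is a sketch rather than anything the CUDA packages provide: `is_stub_path` and `resolved_libcuda` are hypothetical helper names, and the only heuristic is that the resolved path contains a directory called "stubs", as in the output above.

```julia
using Libdl

# Hypothetical helper: true when a library path goes through a "stubs" directory.
function is_stub_path(path::AbstractString)
    parts = split(replace(path, '\\' => '/'), '/')
    return "stubs" in parts
end

# Hypothetical helper: resolve libcuda via Libdl.dlpath, or return `nothing`
# if the library cannot be loaded at all.
function resolved_libcuda()
    try
        return Libdl.dlpath("libcuda")
    catch
        return nothing
    end
end

path = resolved_libcuda()
if path === nothing
    println("libcuda could not be loaded; is the NVIDIA driver installed?")
elseif is_stub_path(path)
    println("Stub library picked up: ", path)
else
    println("Driver library found: ", path)
end
```

If the stub is reported, check whether the toolkit's stubs directory appears in the library search path ahead of the driver's library directory.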