Hi,
I’m trying to install CuArrays.jl. Even though installation finishes without any errors, the tests are failing with the following error message:
CUDA error: no kernel image is available for execution on the device (code #209, ERROR_NO_BINARY_FOR_GPU)
The same error occurs when I try to test CUDAnative.jl. However, CUDAdrv.jl passes all tests.
I even tried rebuilding,
(v1.1) pkg> build CuArrays
Building CUDAdrv ───→ `~/.julia/packages/CUDAdrv/3cR2F/deps/build.log`
Building LLVM ──────→ `~/.julia/packages/LLVM/tg8MX/deps/build.log`
Building CUDAnative → `~/.julia/packages/CUDAnative/wU0tS/deps/build.log`
Building Conda ─────→ `~/.julia/packages/Conda/CpuvI/deps/build.log`
Building FFTW ──────→ `~/.julia/packages/FFTW/p7sLQ/deps/build.log`
Building CuArrays ──→ `~/.julia/packages/CuArrays/PwSdF/deps/build.log`
but the same error occurs again when running the tests.
System information is:
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.1 (ORCJIT, haswell)
Package status:
(v1.1) pkg> st
Status `~/.julia/environments/v1.1/Project.toml`
[3a865a2d] CuArrays v1.0.2
[438e738f] PyCall v1.91.2
Output of nvidia-smi from the terminal:
Sun May 12 09:13:54 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 610      Off  | 00000000:02:00.0 N/A |                  N/A |
| 56%   70C    P0    N/A /  N/A |    721MiB /  1985MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 00000000:03:00.0 Off |                  N/A |
| 36%   77C    P8    22W / 250W |    117MiB / 12212MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
Which test fails? Please reduce that, if possible, and file an issue with some info on your device and set-up (model, compute capability, CUDA version, etc).
Probably some feature is being tested although it isn’t supported by your device. We currently don’t have a way to dispatch on that.
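In case it helps with collecting that information: a minimal sketch for listing each device with its compute capability, assuming CUDAdrv.jl loads correctly (its tests reportedly pass here). The calls used are `CUDAdrv.version`, `CUDAdrv.devices`, `CUDAdrv.name` and `CUDAdrv.capability`. The printed index is just the iteration order and may not match the numbering shown by nvidia-smi.

```julia
using CUDAdrv  # assumed to be installed and functional

# CUDA version supported by the installed driver
println("CUDA version (driver): ", CUDAdrv.version())

# One line per visible device; i - 1 gives a 0-based index in iteration order
for (i, dev) in enumerate(CUDAdrv.devices())
    println("device ", i - 1, ": ", CUDAdrv.name(dev),
            ", compute capability ", CUDAdrv.capability(dev))
end
```

With two quite different cards in one machine, the per-device capability is probably the most useful part of this output to include in an issue.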
Lots of tests failed. I believe this is some kind of configuration issue.
Below is the test summary.
Test Summary: | Pass Fail Error Total
CuArrays | 706 12 917 1635
GPUArrays test suite | 526 366 892
construction | 382 4 386
constructors + similar | 180 180
comparison against Array | 119 119
conversion | 72 72
value constructors | 11 1 12
iterator constructors | 3 3
parallel execution interface | 1 1
indexing | 72 10 82
Indexing with Float32 | 32 2 34
multi dim, sliced setindex | 1 1
Indexing with Int32 | 32 2 34
multi dim, sliced setindex | 1 1
Indexing with Float32 | 1 1 2
Indexing with Int32 | 1 1 2
issue #42 with Float32 | 3 3
issue #42 with Int32 | 3 3
Colon() Float32 | 1 1
Colon() Int32 | 1 1
input/output | 5 5
base functionality | 4 16 20
copyto! | 1 1
vcat + hcat | 5 5
reinterpret | 2 2
ntuple test | 1 1
cartesian iteration | 1 1
Custom kernel from Julia function | 1 1
map | 3 3
repeat | 4 4
heuristics | 2 2
mapreduce | 51 162 213
Float32 | 8 26 34
mapreducedim | 18 18
sum maximum minimum prod | 8 8 16
Float64 | 8 26 34
mapreducedim | 18 18
sum maximum minimum prod | 8 8 16
Int32 | 8 26 34
mapreducedim | 18 18
sum maximum minimum prod | 8 8 16
Int64 | 8 26 34
mapreducedim | 18 18
sum maximum minimum prod | 8 8 16
Complex{Float32} | 4 22 26
mapreducedim | 18 18
sum maximum minimum prod | 4 4 8
Complex{Float64} | 4 22 26
mapreducedim | 18 18
sum maximum minimum prod | 4 4 8
any all == | 9 12 21
isapprox | 2 2 4
broadcast | 2 105 107
broadcast Float32 | 1 20 21
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
broadcast Float64 | 21 21
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
broadcast Int32 | 1 15 16
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
broadcast Int64 | 16 16
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
broadcast Complex{Float32} | 16 16
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
broadcast Complex{Float64} | 16 16
RefValue | 1 1
Tuple | 1 1
Adjoint and Transpose | 1 1
vec 3 | 1 1
linear algebra | 10 4 14
transpose | 2 2
permutedims | 4 4
issymmetric/ishermitian | 8 8
FFT with ND = 1 | 4 4
FFT with ND = 2 | 4 4
FFT with ND = 3 | 4 4
BLAS | 51 51
matmul with element type Float32 | 12 12
matmul with element type Float64 | 12 12
matmul with element type Complex{Float32} | 12 12
matmul with element type Complex{Float64} | 12 12
rmul! Complex{Float32} | 1 1
rmul! Float32 | 1 1
gbmv | 1 1
Random | 1 1
rand | 1 1
Memory | 5 5
Array | 20 2 22
Adapt | 1 1 2
Broadcast | 10 10
Cufunc | 3 3 6
Ref Broadcast | 1 1
Broadcast Fix | 4 4
Reduce | 2 4 6
0D | 1 1
Slices | 15 2 17
Reshape | 1 1
LinearAlgebra.triu! with diagonal -2 | 1 1
LinearAlgebra.triu! with diagonal -1 | 1 1
LinearAlgebra.triu! with diagonal 0 | 1 1
LinearAlgebra.triu! with diagonal 1 | 1 1
LinearAlgebra.triu! with diagonal 2 | 1 1
LinearAlgebra.tril! with diagonal -2 | 1 1
LinearAlgebra.tril! with diagonal -1 | 1 1
LinearAlgebra.tril! with diagonal 0 | 1 1
LinearAlgebra.tril! with diagonal 1 | 1 1
LinearAlgebra.tril! with diagonal 2 | 1 1
Utilities | 2 2
accumulate | 1 7 8
logical indexing | 15 15
generic fallbacks | 1 1
CUDNN | 59 11 70
NNlib | 59 8 67
Activations and Other Ops | 3 3
CUBLAS | 4 1 5
CUSPARSE | 66 8 359 433
util | 7 1 8
char | 15 15
conversion | 8 52 60
elty = Float32 | 2 13 15
make_csc | 2 2
make_csr | 1 1
convert_r2c | 1 1
convert_r2b | 1 1
convert_c2b | 1 1
convert_c2h | 1 1
convert_r2h | 1 1
convert_d2h | 1 1
convert_d2b | 1 1
convert_c2r | 1 1
convert_r2d | 1 1
convert_c2d | 1 1
convert_d2c | 1 1
convert_d2r | 1 1
elty = Float64 | 2 13 15
make_csc | 2 2
make_csr | 1 1
convert_r2c | 1 1
convert_r2b | 1 1
convert_c2b | 1 1
convert_c2h | 1 1
convert_r2h | 1 1
convert_d2h | 1 1
convert_d2b | 1 1
convert_c2r | 1 1
convert_r2d | 1 1
convert_c2d | 1 1
convert_d2c | 1 1
convert_d2r | 1 1
elty = Complex{Float32} | 2 13 15
make_csc | 2 2
make_csr | 1 1
convert_r2c | 1 1
convert_r2b | 1 1
convert_c2b | 1 1
convert_c2h | 1 1
convert_r2h | 1 1
convert_d2h | 1 1
convert_d2b | 1 1
convert_c2r | 1 1
convert_r2d | 1 1
convert_c2d | 1 1
convert_d2c | 1 1
convert_d2r | 1 1
elty = Complex{Float64} | 2 13 15
make_csc | 2 2
make_csr | 1 1
convert_r2c | 1 1
convert_r2b | 1 1
convert_c2b | 1 1
convert_c2h | 1 1
convert_r2h | 1 1
convert_d2h | 1 1
convert_d2b | 1 1
convert_c2r | 1 1
convert_r2d | 1 1
convert_c2d | 1 1
convert_d2c | 1 1
convert_d2r | 1 1
bsric02 | 8 8
elty = Float32 | 2 2
bsric02! | 1 1
bsric02 | 1 1
elty = Float64 | 2 2
bsric02! | 1 1
bsric02 | 1 1
elty = Complex{Float32} | 2 2
bsric02! | 1 1
bsric02 | 1 1
elty = Complex{Float64} | 2 2
bsric02! | 1 1
bsric02 | 1 1
bsrilu02 | 8 8
elty = Float32 | 2 2
bsrilu02! | 1 1
bsrilu02 | 1 1
elty = Float64 | 2 2
bsrilu02! | 1 1
bsrilu02 | 1 1
elty = Complex{Float32} | 2 2
bsrilu02! | 1 1
bsrilu02 | 1 1
elty = Complex{Float64} | 2 2
bsrilu02! | 1 1
bsrilu02 | 1 1
bsrsm2 | 8 8
elty = Float32 | 2 2
bsrsm2! | 1 1
bsrsm2 | 1 1
elty = Float64 | 2 2
bsrsm2! | 1 1
bsrsm2 | 1 1
elty = Complex{Float32} | 2 2
bsrsm2! | 1 1
bsrsm2 | 1 1
elty = Complex{Float64} | 2 2
bsrsm2! | 1 1
bsrsm2 | 1 1
bsrsv2 | 8 8
elty = Float32 | 2 2
bsrsv2! | 1 1
bsrsv2 | 1 1
elty = Float64 | 2 2
bsrsv2! | 1 1
bsrsv2 | 1 1
elty = Complex{Float32} | 2 2
bsrsv2! | 1 1
bsrsv2 | 1 1
elty = Complex{Float64} | 2 2
bsrsv2! | 1 1
bsrsv2 | 1 1
ilu0 | 8 8
elty = Float32 | 2 2
csr | 1 1
csc | 1 1
elty = Float64 | 2 2
csr | 1 1
csc | 1 1
elty = Complex{Float32} | 2 2
csr | 1 1
csc | 1 1
elty = Complex{Float64} | 2 2
csr | 1 1
csc | 1 1
ilu02 | 8 8
elty = Float32 | 2 2
csr | 1 1
csc | 1 1
elty = Float64 | 2 2
csr | 1 1
csc | 1 1
elty = Complex{Float32} | 2 2
csr | 1 1
csc | 1 1
elty = Complex{Float64} | 2 2
csr | 1 1
csc | 1 1
ic0 | 8 8
elty = Float32 | 2 2
csr | 1 1
csc | 1 1
elty = Float64 | 2 2
csr | 1 1
csc | 1 1
elty = Complex{Float32} | 2 2
csr | 1 1
.........................
skipped due to discourse limits.
........................
csreigs | 1 1
csrlsqvqr! | 1 1
ERROR: LoadError: Some tests did not pass: 706 passed, 12 failed, 917 errored, 0 broken.
in expression starting at /home/kach/.julia/packages/CuArrays/PwSdF/test/runtests.jl:17
ERROR: Package CuArrays errored during testing
Not necessarily. CUDA errors are sticky, so the same one keeps popping up. If it also fails with CUDAnative, please post that test output / try to figure out which test is the first to fail.
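One possible way to find the first failure, sketched under a few assumptions: the test output has been saved to a file (the name `cudanative_test.log` below is hypothetical), and failures show up with the usual `Error During Test` / `Test Failed` wording of Julia's Test output or with the `CUDA error` text quoted above. The helper `first_failure` is made up for illustration and is not part of any package.

```julia
# Return (line number, text) of the first log line that reports a failure,
# or `nothing` if no line matches any of the markers.
function first_failure(lines)
    markers = ("Error During Test", "Test Failed", "CUDA error")
    for (i, line) in enumerate(lines)
        if any(m -> occursin(m, line), markers)
            return (i, line)
        end
    end
    return nothing
end

# Usage, assuming the log was captured with something like
#   julia -e 'using Pkg; Pkg.test("CUDAnative")' > cudanative_test.log 2>&1
# first_failure(readlines("cudanative_test.log"))
```

Because the error is sticky, only the first hit is likely to be informative; everything after it may just be the same error resurfacing.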
Hi @maleadt
The complete test log for CUDAnative.jl is available at https://gist.github.com/v-i-s-h/26b95fffa232d3c3bf80784b57a1b40a
All the errors are due to the same condition:
CUDA error: no kernel image is available for execution on the device (code #209, ERROR_NO_BINARY_FOR_GPU)