CUDA/Flux installation error: Ubuntu 20.04, CUDA toolkit 11.1, Julia 1.5.2

Hi

I have updated Ubuntu to 20.04, Julia to 1.5.2, and the CUDA toolkit to 11.1, and now I have trouble with both CUDA and Flux. They both install but do not pass their tests or compile. I removed one and tried the other, changed the order in which I updated them, etc., but I cannot get them to work. Any suggestions?

Errors are below.

P.S. I did try add CUDA@2.x.x (with 1, 2, 0 for the x's), but that did not install at all.
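
For reference, pinning a specific version from the Pkg REPL looks roughly like this (illustrative, not the exact commands I ran):

(@v1.5) pkg> add CUDA@2.1.0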

Thanks
Nima

(@v1.5) pkg> add CUDA
Updating registry at ~/.julia/registries/General
Resolving package versions…
Updating ~/.julia/environments/v1.5/Project.toml
[052768ef] + CUDA v0.1.0
Updating ~/.julia/environments/v1.5/Manifest.toml
[fa961155] ↓ CEnum v0.3.0 ⇒ v0.2.0
[052768ef] + CUDA v0.1.0
[be33ccc6] ↓ CUDAnative v3.2.0 ⇒ v3.0.4
[61eb1bfa] ↑ GPUCompiler v0.2.0 ⇒ v0.3.0

(@v1.5) pkg> test CUDA
Testing CUDA
Status /tmp/jl_253Iww/Project.toml
[621f4979] AbstractFFTs v0.5.0
[79e6a3ab] Adapt v1.1.0
[b99e7846] BinaryProvider v0.5.10
[fa961155] CEnum v0.2.0
[052768ef] CUDA v0.1.0
[864edb3b] DataStructures v0.17.20
[e2ba6199] ExprTools v0.1.3
[7a1cc6ca] FFTW v1.2.4
[1a297f60] FillArrays v0.8.14
[f6369f11] ForwardDiff v0.10.12
[0c68f7d7] GPUArrays v3.4.1
[61eb1bfa] GPUCompiler v0.3.0
[929cbde3] LLVM v1.7.0
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.6.6
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[a759f4b9] TimerOutputs v0.5.7
[ade2ca70] Dates
[8ba89e20] Distributed
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[44cfe95a] Pkg
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
Status /tmp/jl_253Iww/Manifest.toml
[621f4979] AbstractFFTs v0.5.0
[79e6a3ab] Adapt v1.1.0
[56f22d72] Artifacts v1.3.0
[b99e7846] BinaryProvider v0.5.10
[fa961155] CEnum v0.2.0
[052768ef] CUDA v0.1.0
[bbf7d656] CommonSubexpressions v0.3.0
[e66e0078] CompilerSupportLibraries_jll v0.3.4+0
[864edb3b] DataStructures v0.17.20
[163ba53b] DiffResults v1.0.2
[b552c78f] DiffRules v1.0.1
[e2ba6199] ExprTools v0.1.3
[7a1cc6ca] FFTW v1.2.4
[f5851436] FFTW_jll v3.3.9+6
[1a297f60] FillArrays v0.8.14
[f6369f11] ForwardDiff v0.10.12
[0c68f7d7] GPUArrays v3.4.1
[61eb1bfa] GPUCompiler v0.3.0
[1d5cc7b8] IntelOpenMP_jll v2018.0.3+0
[692b3bcd] JLLWrappers v1.1.3
[929cbde3] LLVM v1.7.0
[856f044c] MKL_jll v2020.2.254+0
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.6.6
[77ba4419] NaNMath v0.3.4
[efe28fd5] OpenSpecFun_jll v0.5.3+4
[bac558e1] OrderedCollections v1.3.2
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[276daf66] SpecialFunctions v0.10.3
[90137ffa] StaticArrays v0.12.5
[a759f4b9] TimerOutputs v0.5.7
[2a0f44e3] Base64
[ade2ca70] Dates
[8ba89e20] Distributed
[b77e0a4c] InteractiveUtils
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[44cfe95a] Pkg
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA
[9e88b42a] Serialization
[6462fe0b] Sockets
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
ERROR: LoadError: LoadError: LoadError: UndefVarError: AddrSpacePtr not defined
Stacktrace:
[1] getproperty(::Module, ::Symbol) at ./Base.jl:26
[2] top-level scope at /home/nima/.julia/packages/CUDA/5t6R9/src/device/cuda/wmma.jl:52
[3] include(::Function, ::Module, ::String) at ./Base.jl:380
[4] include at ./Base.jl:368 [inlined]
[5] include(::String) at /home/nima/.julia/packages/CUDA/5t6R9/src/CUDA.jl:1
[6] top-level scope at /home/nima/.julia/packages/CUDA/5t6R9/src/device/cuda.jl:15
[7] include(::Function, ::Module, ::String) at ./Base.jl:380
[8] include at ./Base.jl:368 [inlined]
[9] include(::String) at /home/nima/.julia/packages/CUDA/5t6R9/src/CUDA.jl:1
[10] top-level scope at /home/nima/.julia/packages/CUDA/5t6R9/src/CUDA.jl:39
[11] include(::Function, ::Module, ::String) at ./Base.jl:380
[12] include(::Module, ::String) at ./Base.jl:368
[13] top-level scope at none:2
[14] eval at ./boot.jl:331 [inlined]
[15] eval(::Expr) at ./client.jl:467
[16] top-level scope at ./none:3
in expression starting at /home/nima/.julia/packages/CUDA/5t6R9/src/device/cuda/wmma.jl:52
in expression starting at /home/nima/.julia/packages/CUDA/5t6R9/src/device/cuda.jl:14
in expression starting at /home/nima/.julia/packages/CUDA/5t6R9/src/CUDA.jl:39
ERROR: LoadError: LoadError: Failed to precompile CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] to /home/nima/.julia/compiled/v1.5/CUDA/oWw5k_t3Fxp.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
[6] include(::String) at ./client.jl:457
[7] top-level scope at /home/nima/.julia/packages/CUDA/5t6R9/test/runtests.jl:42
[8] include(::String) at ./client.jl:457
[9] top-level scope at none:6
in expression starting at /home/nima/.julia/packages/CUDA/5t6R9/test/setup.jl:1
in expression starting at /home/nima/.julia/packages/CUDA/5t6R9/test/runtests.jl:42
ERROR: Package CUDA errored during testing

(@v1.5) pkg> add Flux,Zygote
Resolving package versions…
Updating ~/.julia/environments/v1.5/Project.toml
[587475ba] + Flux v0.10.4
[e88e6eb3] + Zygote v0.4.22
No Changes to ~/.julia/environments/v1.5/Manifest.toml

(@v1.5) pkg> build Flux
Building NNlib → ~/.julia/packages/NNlib/FAI3o/deps/build.log

(@v1.5) pkg> test Flux
Testing Flux
Status /tmp/jl_RR0wDx/Project.toml
[1520ce14] AbstractTrees v0.3.3
[79e6a3ab] Adapt v1.1.0
[944b1d66] CodecZlib v0.7.0
[5ae59095] Colors v0.12.4
[3a865a2d] CuArrays v2.2.2
[e30172f5] Documenter v0.25.3
[587475ba] Flux v0.10.4
[c8e1da08] IterTools v1.3.0
[e5e0dc1b] Juno v0.8.4
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.6.6
[189a3867] Reexport v0.2.0
[2913bbd2] StatsBase v0.33.2
[a5390f91] ZipFile v0.9.3
[e88e6eb3] Zygote v0.4.22
[8bb1440f] DelimitedFiles
[37e2e46d] LinearAlgebra
[44cfe95a] Pkg
[de0858da] Printf
[9a3f8284] Random
[ea8e919c] SHA
[10745b16] Statistics
[8dfed614] Test
Status /tmp/jl_RR0wDx/Manifest.toml
[621f4979] AbstractFFTs v0.5.0
[1520ce14] AbstractTrees v0.3.3
[79e6a3ab] Adapt v1.1.0
[4c555306] ArrayLayouts v0.3.8
[56f22d72] Artifacts v1.3.0
[b99e7846] BinaryProvider v0.5.10
[fa961155] CEnum v0.2.0
[3895d2a7] CUDAapi v4.0.0
[c5f51814] CUDAdrv v6.3.0
[be33ccc6] CUDAnative v3.0.4
[082447d4] ChainRules v0.6.5
[d360d2e6] ChainRulesCore v0.8.1
[da1fd8a2] CodeTracking v0.5.12
[944b1d66] CodecZlib v0.7.0
[3da002f7] ColorTypes v0.10.9
[5ae59095] Colors v0.12.4
[bbf7d656] CommonSubexpressions v0.3.0
[e66e0078] CompilerSupportLibraries_jll v0.3.4+0
[f68482b8] Cthulhu v1.3.0
[3a865a2d] CuArrays v2.2.2
[9a962f9c] DataAPI v1.4.0
[864edb3b] DataStructures v0.17.20
[163ba53b] DiffResults v1.0.2
[b552c78f] DiffRules v1.0.1
[ffbed154] DocStringExtensions v0.8.3
[e30172f5] Documenter v0.25.3
[e2ba6199] ExprTools v0.1.3
[1a297f60] FillArrays v0.8.14
[53c48c17] FixedPointNumbers v0.8.4
[587475ba] Flux v0.10.4
[1eca21be] FoldingTrees v1.0.1
[f6369f11] ForwardDiff v0.10.12
[0c68f7d7] GPUArrays v3.4.1
[b5f81e59] IOCapture v0.1.1
[7869d1d1] IRTools v0.4.1
[c8e1da08] IterTools v1.3.0
[692b3bcd] JLLWrappers v1.1.3
[682c06a0] JSON v0.21.1
[e5e0dc1b] Juno v0.8.4
[929cbde3] LLVM v1.7.0
[1914dd2f] MacroTools v0.5.6
[e89f7d12] Media v0.5.0
[e1d29d7a] Missings v0.4.4
[46d2c3a1] MuladdMacro v0.2.2
[872c559c] NNlib v0.6.6
[77ba4419] NaNMath v0.3.4
[efe28fd5] OpenSpecFun_jll v0.5.3+4
[bac558e1] OrderedCollections v1.3.2
[69de0a69] Parsers v1.0.11
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[a2af1166] SortingAlgorithms v0.3.1
[276daf66] SpecialFunctions v0.10.3
[90137ffa] StaticArrays v0.12.5
[2913bbd2] StatsBase v0.33.2
[a759f4b9] TimerOutputs v0.5.7
[3bb67fe8] TranscodingStreams v0.9.5
[a5390f91] ZipFile v0.9.3
[83775a58] Zlib_jll v1.2.11+18
[e88e6eb3] Zygote v0.4.22
[700de1a5] ZygoteRules v0.2.0
[2a0f44e3] Base64
[ade2ca70] Dates
[8bb1440f] DelimitedFiles
[8ba89e20] Distributed
[9fa8497b] Future
[b77e0a4c] InteractiveUtils
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[a63ad114] Mmap
[44cfe95a] Pkg
[de0858da] Printf
[9abbd945] Profile
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA
[9e88b42a] Serialization
[6462fe0b] Sockets
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
ERROR: LoadError: LoadError: LoadError: UndefVarError: AddrSpacePtr not defined
Stacktrace:
[1] getproperty(::Module, ::Symbol) at ./Base.jl:26
[2] top-level scope at /home/nima/.julia/packages/CUDAnative/ierw8/src/device/cuda/wmma.jl:56
[3] include(::Function, ::Module, ::String) at ./Base.jl:380
[4] include at ./Base.jl:368 [inlined]
[5] include(::String) at /home/nima/.julia/packages/CUDAnative/ierw8/src/CUDAnative.jl:1
[6] top-level scope at /home/nima/.julia/packages/CUDAnative/ierw8/src/device/cuda.jl:14
[7] include(::Function, ::Module, ::String) at ./Base.jl:380
[8] include at ./Base.jl:368 [inlined]
[9] include(::String) at /home/nima/.julia/packages/CUDAnative/ierw8/src/CUDAnative.jl:1
[10] top-level scope at /home/nima/.julia/packages/CUDAnative/ierw8/src/CUDAnative.jl:70
[11] include(::Function, ::Module, ::String) at ./Base.jl:380
[12] include(::Module, ::String) at ./Base.jl:368
[13] top-level scope at none:2
[14] eval at ./boot.jl:331 [inlined]
[15] eval(::Expr) at ./client.jl:467
[16] top-level scope at ./none:3
in expression starting at /home/nima/.julia/packages/CUDAnative/ierw8/src/device/cuda/wmma.jl:55
in expression starting at /home/nima/.julia/packages/CUDAnative/ierw8/src/device/cuda.jl:14
in expression starting at /home/nima/.julia/packages/CUDAnative/ierw8/src/CUDAnative.jl:70
ERROR: LoadError: Failed to precompile CUDAnative [be33ccc6-a3ff-5ff2-a52e-74243cff1e17] to /home/nima/.julia/compiled/v1.5/CUDAnative/4Zu2W_92tlf.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
[6] include(::Function, ::Module, ::String) at ./Base.jl:380
[7] include(::Module, ::String) at ./Base.jl:368
[8] top-level scope at none:2
[9] eval at ./boot.jl:331 [inlined]
[10] eval(::Expr) at ./client.jl:467
[11] top-level scope at ./none:3
in expression starting at /home/nima/.julia/packages/CuArrays/YFdj7/src/CuArrays.jl:3
ERROR: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/nima/.julia/compiled/v1.5/CuArrays/7YFE0_92tlf.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
[6] include(::Function, ::Module, ::String) at ./Base.jl:380
[7] include(::Module, ::String) at ./Base.jl:368
[8] top-level scope at none:2
[9] eval at ./boot.jl:331 [inlined]
[10] eval(::Expr) at ./client.jl:467
[11] top-level scope at ./none:3
in expression starting at /home/nima/.julia/packages/Flux/Fj3bt/src/Flux.jl:26
ERROR: LoadError: Failed to precompile Flux [587475ba-b771-5e3f-ad9e-33799f191a9c] to /home/nima/.julia/compiled/v1.5/Flux/QdkVy_92tlf.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
[6] include(::String) at ./client.jl:457
[7] top-level scope at none:6
in expression starting at /home/nima/.julia/packages/Flux/Fj3bt/test/runtests.jl:1
ERROR: Package Flux errored during testing

At first sight, it seems like other packages in your environment have conflicting requirements, given that add CUDA leaves you with [052768ef] + CUDA v0.1.0, which is a very old version of CUDA. Could you try installing CUDA in a fresh environment/project?
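
Roughly something like this (a minimal sketch; the directory name is arbitrary):

julia> mkdir("cudatest"); cd("cudatest")

(@v1.5) pkg> activate .

(cudatest) pkg> add CUDA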

Thanks!
It worked (i.e. the installation went through, I can load it with "using CUDA", and I have tested a few things).
Though some tests did not pass (for both CUDA and Flux). I document them below in case it helps others use it, or helps the developers fix it.

For CUDA, "cusolver" fails. For Flux, there are 3 failures, and the source of the error is in "curnn.jl".
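
For context on the cusolver failure below: the ≈ in the failing test is Julia's isapprox, whose default relative tolerance for Float32 is sqrt(eps(Float32)), roughly 3.5e-4, so small element-wise differences like the ones shown in the Evaluated line can tip the comparison. A small illustration of how that comparison behaves (purely illustrative; this is not what the test suite itself does):

julia> sqrt(eps(Float32))          # default rtol used by ≈ for Float32 values
0.00034526698f0

julia> a = Float32[1 2; 3 4]; b = a .+ 1f-2;

julia> a ≈ b                       # fails with the default tolerance
false

julia> isapprox(a, b; rtol=1f-2)   # passes once the tolerance is loosened
true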

julia> mkdir("gpu")
"gpu"

julia> cd("gpu")

(@v1.5) pkg> activate .
Activating new environment at ~/Julia/gpu/Project.toml

(gpu) pkg> st
Status ~/Julia/gpu/Project.toml (empty project)

(gpu) pkg> add CUDA
Resolving package versions…
Installed Adapt ────────── v2.3.0
Installed BFloat16s ────── v0.1.0
Installed Scratch ──────── v1.0.3
Installed DataStructures ─ v0.18.8
Installed NNlib ────────── v0.7.6
Installed GPUArrays ────── v6.1.1
Installed LLVM ─────────── v3.3.0
Installed CUDA ─────────── v2.1.0
Updating ~/Julia/gpu/Project.toml
[052768ef] + CUDA v2.1.0
Updating ~/Julia/gpu/Manifest.toml
[621f4979] + AbstractFFTs v0.5.0
[79e6a3ab] + Adapt v2.3.0
[ab4f0b2a] + BFloat16s v0.1.0
[fa961155] + CEnum v0.4.1
[052768ef] + CUDA v2.1.0
[34da2185] + Compat v3.23.0
[864edb3b] + DataStructures v0.18.8
[e2ba6199] + ExprTools v0.1.3
[0c68f7d7] + GPUArrays v6.1.1
[61eb1bfa] + GPUCompiler v0.8.3
[929cbde3] + LLVM v3.3.0
[1914dd2f] + MacroTools v0.5.6
[872c559c] + NNlib v0.7.6
[bac558e1] + OrderedCollections v1.3.2
[189a3867] + Reexport v0.2.0
[ae029012] + Requires v1.1.0
[6c6a2e73] + Scratch v1.0.3
[a759f4b9] + TimerOutputs v0.5.7
[2a0f44e3] + Base64
[ade2ca70] + Dates
[8bb1440f] + DelimitedFiles
[8ba89e20] + Distributed
[b77e0a4c] + InteractiveUtils
[76f85450] + LibGit2
[8f399da3] + Libdl
[37e2e46d] + LinearAlgebra
[56ddb016] + Logging
[d6f4376e] + Markdown
[a63ad114] + Mmap
[44cfe95a] + Pkg
[de0858da] + Printf
[3fa0cd96] + REPL
[9a3f8284] + Random
[ea8e919c] + SHA
[9e88b42a] + Serialization
[1a1011a3] + SharedArrays
[6462fe0b] + Sockets
[2f01184e] + SparseArrays
[10745b16] + Statistics
[8dfed614] + Test
[cf7118a7] + UUIDs
[4ec0a83e] + Unicode

(gpu) pkg> test CUDA
Testing CUDA
Status /tmp/jl_BRECix/Project.toml
[621f4979] AbstractFFTs v0.5.0
[79e6a3ab] Adapt v2.3.0
[ab4f0b2a] BFloat16s v0.1.0
[fa961155] CEnum v0.4.1
[052768ef] CUDA v2.1.0
[864edb3b] DataStructures v0.18.8
[e2ba6199] ExprTools v0.1.3
[7a1cc6ca] FFTW v1.2.4
[1a297f60] FillArrays v0.10.0
[f6369f11] ForwardDiff v0.10.12
[0c68f7d7] GPUArrays v6.1.1
[61eb1bfa] GPUCompiler v0.8.3
[a98d9a8b] Interpolations v0.13.0
[929cbde3] LLVM v3.3.0
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.7.6
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[a759f4b9] TimerOutputs v0.5.7
[ade2ca70] Dates
[8ba89e20] Distributed
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[44cfe95a] Pkg
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
Status /tmp/jl_BRECix/Manifest.toml
[621f4979] AbstractFFTs v0.5.0
[79e6a3ab] Adapt v2.3.0
[56f22d72] Artifacts v1.3.0
[13072b0f] AxisAlgorithms v1.0.0
[ab4f0b2a] BFloat16s v0.1.0
[fa961155] CEnum v0.4.1
[052768ef] CUDA v2.1.0
[bbf7d656] CommonSubexpressions v0.3.0
[34da2185] Compat v3.23.0
[e66e0078] CompilerSupportLibraries_jll v0.3.4+0
[864edb3b] DataStructures v0.18.8
[163ba53b] DiffResults v1.0.2
[b552c78f] DiffRules v1.0.1
[e2ba6199] ExprTools v0.1.3
[7a1cc6ca] FFTW v1.2.4
[f5851436] FFTW_jll v3.3.9+6
[1a297f60] FillArrays v0.10.0
[f6369f11] ForwardDiff v0.10.12
[0c68f7d7] GPUArrays v6.1.1
[61eb1bfa] GPUCompiler v0.8.3
[1d5cc7b8] IntelOpenMP_jll v2018.0.3+0
[a98d9a8b] Interpolations v0.13.0
[692b3bcd] JLLWrappers v1.1.3
[929cbde3] LLVM v3.3.0
[856f044c] MKL_jll v2020.2.254+0
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.7.6
[77ba4419] NaNMath v0.3.4
[6fe1bfb0] OffsetArrays v1.4.0
[efe28fd5] OpenSpecFun_jll v0.5.3+4
[bac558e1] OrderedCollections v1.3.2
[c84ed2f1] Ratios v0.4.0
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[6c6a2e73] Scratch v1.0.3
[276daf66] SpecialFunctions v0.10.3
[90137ffa] StaticArrays v0.12.5
[a759f4b9] TimerOutputs v0.5.7
[efce3f68] WoodburyMatrices v0.5.3
[2a0f44e3] Base64
[ade2ca70] Dates
[8bb1440f] DelimitedFiles
[8ba89e20] Distributed
[b77e0a4c] InteractiveUtils
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[a63ad114] Mmap
[44cfe95a] Pkg
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA
[9e88b42a] Serialization
[1a1011a3] SharedArrays
[6462fe0b] Sockets
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
Downloading artifact: CUDA111
Downloading artifact: CUDNN_CUDA111
Downloading artifact: CUTENSOR_CUDA111
┌ Info: System information:
│ CUDA toolkit 11.1.0, artifact installation
│ CUDA driver 11.1.0
│ NVIDIA driver 455.32.0

│ Libraries:
│ - CUBLAS: 11.3.0
│ - CURAND: 10.2.2
│ - CUFFT: 10.3.0
│ - CUSOLVER: 11.0.0
│ - CUSPARSE: 11.2.0
│ - CUPTI: 14.0.0
│ - NVML: 11.0.0+455.32.0
│ - CUDNN: 8.0.4 (for CUDA 11.1.0)
│ - CUTENSOR: 1.2.1 (for CUDA 11.1.0)

│ Toolchain:
│ - Julia: 1.5.2
│ - LLVM: 9.0.1
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4
│ - Device support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75

│ 1 device:
└ 0: Graphics Device (sm_86, 6.687 GiB / 7.795 GiB available)
[ Info: Testing using 1 device(s): 1. Graphics Device (UUID 0009a198-10ce-4a24-638b-2049395bc18b)
| | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
initialization (2) | 6.44 | 0.00 | 0.0 | 0.00 | 151.00 | 0.20 | 3.0 | 484.08 | 2035.12 |
apiutils (3) | 1.89 | 0.00 | 0.0 | 0.00 | 151.00 | 0.12 | 6.5 | 307.80 | 2035.12 |
curand (10) | 1.96 | 0.00 | 0.0 | 0.00 | 159.00 | 0.14 | 7.1 | 329.06 | 2035.12 |
codegen (6) | 21.51 | 0.05 | 0.3 | 0.00 | 179.00 | 0.66 | 3.1 | 1386.58 | 2035.12 |
iterator (6) | 3.54 | 0.06 | 1.8 | 1.07 | 161.00 | 0.07 | 2.1 | 248.52 | 2035.12 |
nnlib (6) | 5.49 | 0.00 | 0.0 | 4.00 | 443.00 | 0.09 | 1.7 | 330.23 | 2533.69 |
nvml (6) | 0.76 | 0.00 | 0.0 | 0.00 | 159.00 | 0.01 | 1.6 | 42.61 | 2533.70 |
nvtx (6) | 0.64 | 0.00 | 0.0 | 0.00 | 159.00 | 0.01 | 2.0 | 62.90 | 2533.70 |
pointer (6) | 0.36 | 0.00 | 0.1 | 0.00 | 161.00 | 0.00 | 0.0 | 11.94 | 2533.70 |
pool (6) | 3.37 | 0.00 | 0.0 | 0.00 | 159.00 | 0.35 | 10.5 | 178.95 | 2533.70 |
broadcast (5) | 51.67 | 0.09 | 0.2 | 0.00 | 171.00 | 1.39 | 2.7 | 3468.98 | 2035.12 |
random (6) | 13.20 | 0.00 | 0.0 | 0.02 | 183.00 | 0.37 | 2.8 | 1152.28 | 2542.42 |
cufft (9) | 62.46 | 0.10 | 0.2 | 155.26 | 539.00 | 1.96 | 3.1 | 4228.38 | 2248.18 |
threading (9) | 5.30 | 0.00 | 0.1 | 18.94 | 451.00 | 0.17 | 3.2 | 424.13 | 2248.18 |
execution (3) | 67.15 | 0.06 | 0.1 | 0.56 | 269.00 | 1.61 | 2.4 | 5055.22 | 2035.12 |
cudadrv/context (3) | 0.93 | 0.00 | 0.0 | 0.00 | 159.00 | 0.00 | 0.0 | 16.70 | 2035.12 |
statistics (5) | 19.20 | 0.00 | 0.0 | 0.00 | 171.00 | 0.94 | 4.9 | 1616.03 | 2035.12 |
utils (9) | 2.04 | 0.00 | 0.0 | 4.00 | 419.00 | 0.03 | 1.5 | 102.88 | 2248.18 |
cudadrv/devices (3) | 0.45 | 0.00 | 0.0 | 0.00 | 159.00 | 0.01 | 3.3 | 28.71 | 2035.12 |
cudadrv/events (9) | 0.19 | 0.00 | 0.0 | 0.00 | 151.00 | 0.00 | 0.0 | 8.07 | 2248.18 |
cudadrv/errors (5) | 0.28 | 0.00 | 0.0 | 0.00 | 151.00 | 0.01 | 4.8 | 17.85 | 2035.12 |
cudadrv/module (5) | 0.99 | 0.00 | 0.0 | 0.00 | 151.00 | 0.01 | 1.2 | 42.12 | 2035.12 |
cudadrv/execution (3) | 1.03 | 0.00 | 0.1 | 0.00 | 161.00 | 0.01 | 1.3 | 57.54 | 2035.12 |
cudadrv/occupancy (5) | 0.19 | 0.00 | 0.0 | 0.00 | 151.00 | 0.00 | 0.0 | 7.26 | 2035.12 |
cudadrv/profile (3) | 0.36 | 0.00 | 0.0 | 0.00 | 159.00 | 0.00 | 0.0 | 41.82 | 2035.12 |
cudadrv/stream (5) | 0.26 | 0.00 | 0.0 | 0.00 | 151.00 | 0.00 | 0.0 | 14.24 | 2035.12 |
cudadrv/version (3) | 0.01 | 0.00 | 0.0 | 0.00 | 159.00 | 0.00 | 0.0 | 0.07 | 2035.12 |
cudadrv/memory (9) | 2.86 | 0.00 | 0.0 | 0.00 | 153.00 | 0.07 | 2.4 | 179.41 | 2248.18 |
cutensor/base (3) | 0.42 | 0.07 | 16.6 | 0.08 | 161.00 | 0.00 | 0.0 | 19.46 | 2035.12 |
cusparse (12) | 85.23 | 0.10 | 0.1 | 8.83 | 525.00 | 2.80 | 3.3 | 6059.02 | 2376.79 |
array (4) | 95.28 | 0.09 | 0.1 | 5.20 | 179.00 | 3.10 | 3.3 | 7798.46 | 2035.12 |
cudnn (8) | 101.62 | 0.10 | 0.1 | 16.00 | 715.00 | 3.36 | 3.3 | 7442.96 | 2953.58 |
cusolver/cusparse (5) | 28.50 | 0.00 | 0.0 | 0.19 | 607.00 | 0.89 | 3.1 | 2148.43 | 2568.28 |
device/array (5) | 3.49 | 0.00 | 0.0 | 0.00 | 171.00 | 0.09 | 2.5 | 270.64 | 2568.28 |
cusolver (11) | failed at 2020-11-08T19:43:04.017
cutensor/permutations (4) | 20.04 | 0.00 | 0.0 | 1.83 | 431.00 | 0.56 | 2.8 | 1448.30 | 2262.41 |
cutensor/elementwise_binary (3) | 44.52 | 0.01 | 0.0 | 8.23 | 445.00 | 1.39 | 3.1 | 3439.36 | 2847.83 |
device/ldg (4) | 8.14 | 0.00 | 0.0 | 0.00 | 171.00 | 0.36 | 4.4 | 638.09 | 2262.41 |
forwarddiff (10) | 122.47 | 0.06 | 0.1 | 0.00 | 171.00 | 1.79 | 1.5 | 4438.47 | 2035.12 |
gpuarrays/math (4) | 4.12 | 0.03 | 0.7 | 0.00 | 171.00 | 0.21 | 5.1 | 417.37 | 2262.41 |
texture (6) | 75.96 | 0.00 | 0.0 | 0.09 | 185.00 | 2.73 | 3.6 | 5799.91 | 2720.47 |
exceptions (2) | 127.47 | 0.00 | 0.0 | 0.00 | 151.00 | 0.02 | 0.0 | 26.31 | 2035.12 |
gpuarrays/input output (4) | 3.53 | 0.00 | 0.0 | 0.00 | 153.00 | 0.22 | 6.2 | 395.43 | 2262.41 |
cutensor/reductions (8) | 33.63 | 0.07 | 0.2 | 36.68 | 431.00 | 0.98 | 2.9 | 2415.27 | 3161.07 |
gpuarrays/indexing scalar (10) | 11.99 | 0.00 | 0.0 | 0.00 | 171.00 | 0.36 | 3.0 | 929.95 | 2035.12 |
gpuarrays/iterator constructors (4) | 6.37 | 0.00 | 0.0 | 0.02 | 171.00 | 0.50 | 7.9 | 455.46 | 2262.41 |
cutensor/elementwise_trinary (12) | 57.01 | 0.00 | 0.0 | 3.66 | 435.00 | 1.92 | 3.4 | 4744.51 | 2550.92 |
gpuarrays/value constructors (6) | 14.02 | 0.00 | 0.0 | 0.00 | 179.00 | 0.51 | 3.6 | 1196.03 | 2720.47 |
gpuarrays/conversions (4) | 4.87 | 0.00 | 0.0 | 0.01 | 153.00 | 0.19 | 4.0 | 416.56 | 2262.41 |
gpuarrays/constructors (12) | 2.73 | 0.00 | 0.2 | 0.03 | 153.00 | 0.03 | 1.0 | 112.42 | 2550.92 |
gpuarrays/uniformscaling (8) | 11.48 | 0.00 | 0.0 | 0.01 | 171.00 | 0.36 | 3.1 | 923.70 | 3161.07 |
cublas (7) | 152.59 | 0.15 | 0.1 | 17.67 | 459.00 | 5.25 | 3.4 | 12274.17 | 2285.89 |
gpuarrays/interface (14) | 18.37 | 0.03 | 0.2 | 0.00 | 171.00 | 0.83 | 4.5 | 1415.08 | 2043.51 |
gpuarrays/base (4) | 20.29 | 0.00 | 0.0 | 17.44 | 203.00 | 0.80 | 3.9 | 2010.91 | 2262.41 |
gpuarrays/random (6) | 23.06 | 0.00 | 0.0 | 0.03 | 183.00 | 0.67 | 2.9 | 1652.88 | 2720.47 |
device/wmma (3) | 51.56 | 0.00 | 0.0 | 0.38 | 181.00 | 1.22 | 2.4 | 3446.19 | 2847.83 |
gpuarrays/indexing multidimensional (2) | 44.74 | 0.07 | 0.1 | 0.69 | 173.00 | 1.54 | 3.4 | 3544.83 | 2035.12 |
cutensor/contractions (9) | 108.95 | 0.04 | 0.0 | 32035.86 | 515.00 | 5.06 | 4.6 | 14151.92 | 2377.03 |
examples (13) | 213.47 | 0.00 | 0.0 | 0.00 | 151.00 | 0.23 | 0.1 | 488.80 | 2035.12 |
gpuarrays/broadcasting (8) | 85.17 | 0.01 | 0.0 | 1.19 | 173.00 | 3.41 | 4.0 | 8432.55 | 3161.07 |
device/intrinsics (5) | 130.33 | 0.00 | 0.0 | 0.01 | 951.00 | 3.70 | 2.8 | 9407.37 | 2585.50 |
gpuarrays/linear algebra (10) | 106.29 | 0.01 | 0.0 | 5.24 | 439.00 | 4.00 | 3.8 | 7773.97 | 2377.88 |
gpuarrays/mapreduce essentials (12) | 135.77 | 0.01 | 0.0 | 3.19 | 175.00 | 5.10 | 3.8 | 13553.26 | 2550.92 |
gpuarrays/mapreduce derivatives (7) | 220.65 | 0.01 | 0.0 | 3.06 | 177.00 | 8.54 | 3.9 | 19128.46 | 2333.92 |
Worker 11 failed running test cusolver:
Some tests did not pass: 1491 passed, 1 failed, 0 errored, 0 broken.
cusolver: Test Failed at /home/nima/.julia/packages/CUDA/0p5fn/test/cusolver.jl:375
Expression: collect(d_U' * d_A * d_V) ≈ U' * A * V
Evaluated: Float32[6.602118 0.000104010105 … -0.00038917363 8.165836f-6; -0.00054194033 1.6551287 … -0.00010341406 4.1127205f-6; … ; -1.12988055f-5 3.300421f-5 … -0.000169307 0.00014225394; -1.7579645f-5 -0.00014031306 … 0.00013932213 -4.0720217f-5] ≈ Float32[6.601884 -1.2717834f-7 … -3.9752106f-8 -3.9203456f-7; -5.928844f-7 1.655306 … -3.2832887f-8 -2.7688507f-8; … ; -4.5828622f-7 -3.089253f-8 … -7.197736f-8 -1.1255685f-7; 1.8005375f-7 9.385222f-8 … -2.198069f-9 4.876814f-8]
Stacktrace:
[1] record(::Test.DefaultTestSet, ::Union{Test.Error, Test.Fail}) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:775
[2] top-level scope at /home/nima/.julia/packages/CUDA/0p5fn/test/runtests.jl:452
[3] include(::String) at ./client.jl:457
[4] top-level scope at none:6
[5] eval(::Module, ::Any) at ./boot.jl:331
[6] exec_options(::Base.JLOptions) at ./client.jl:272
[7] _start() at ./client.jl:506

Test Summary: | Pass Fail Broken Total
Overall | 12745 1 5 12751
initialization | 25 25
apiutils | 15 15
curand | 1 1
codegen | 9 9
iterator | 30 30
nnlib | 4 4
nvml | 7 7
nvtx | No tests
pointer | 35 35
pool | 10 10
broadcast | 29 29
random | 101 101
cufft | 175 175
threading | No tests
execution | 66 66
cudadrv/context | 12 12
statistics | 18 18
utils | 5 5
cudadrv/devices | 6 6
cudadrv/events | 6 6
cudadrv/errors | 6 6
cudadrv/module | 12 12
cudadrv/execution | 15 15
cudadrv/occupancy | 1 1
cudadrv/profile | 2 2
cudadrv/stream | 7 7
cudadrv/version | 3 3
cudadrv/memory | 49 1 50
cutensor/base | 8 8
cusparse | 497 497
array | 180 180
cudnn | 147 147
cusolver/cusparse | 84 84
device/array | 18 18
cusolver | 1491 1 1492
cutensor/permutations | 80 80
cutensor/elementwise_binary | 260 260
device/ldg | 21 21
forwarddiff | 107 107
gpuarrays/math | 8 8
texture | 38 4 42
exceptions | 17 17
gpuarrays/input output | 5 5
cutensor/reductions | 280 280
gpuarrays/indexing scalar | 249 249
gpuarrays/iterator constructors | 24 24
cutensor/elementwise_trinary | 340 340
gpuarrays/value constructors | 36 36
gpuarrays/conversions | 72 72
gpuarrays/constructors | 335 335
gpuarrays/uniformscaling | 56 56
cublas | 1920 1920
gpuarrays/interface | 7 7
gpuarrays/base | 39 39
gpuarrays/random | 46 46
device/wmma | 210 210
gpuarrays/indexing multidimensional | 34 34
cutensor/contractions | 3321 3321
examples | 7 7
gpuarrays/broadcasting | 155 155
device/intrinsics | 266 266
gpuarrays/linear algebra | 389 389
gpuarrays/mapreduce essentials | 522 522
gpuarrays/mapreduce derivatives | 827 827
FAILURE

Error in testset cusolver:
Test Failed at /home/nima/.julia/packages/CUDA/0p5fn/test/cusolver.jl:375
Expression: collect(d_U' * d_A * d_V) ≈ U' * A * V
Evaluated: Float32[6.602118 0.000104010105 … -0.00038917363 8.165836f-6; -0.00054194033 1.6551287 … -0.00010341406 4.1127205f-6; … ; -1.12988055f-5 3.300421f-5 … -0.000169307 0.00014225394; -1.7579645f-5 -0.00014031306 … 0.00013932213 -4.0720217f-5] ≈ Float32[6.601884 -1.2717834f-7 … -3.9752106f-8 -3.9203456f-7; -5.928844f-7 1.655306 … -3.2832887f-8 -2.7688507f-8; … ; -4.5828622f-7 -3.089253f-8 … -7.197736f-8 -1.1255685f-7; 1.8005375f-7 9.385222f-8 … -2.198069f-9 4.876814f-8]
ERROR: LoadError: Test run finished with errors
in expression starting at /home/nima/.julia/packages/CUDA/0p5fn/test/runtests.jl:483
ERROR: Package CUDA errored during testing

and here is Flux test/error

(gpu) pkg> add Flux
Resolving package versions…

Installed Richardson ────────── v1.2.0
Installed ArrayLayouts ──────── v0.4.10
Installed ChainRulesTestUtils ─ v0.5.3
Installed FiniteDifferences ─── v0.11.2
Installed ChainRules ────────── v0.7.32
Installed ChainRulesCore ────── v0.9.17
Installed Zygote ────────────── v0.5.9
Installed Flux ──────────────── v0.11.2
Installed FillArrays ────────── v0.9.7
Updating ~/Julia/gpu/Project.toml
[587475ba] + Flux v0.11.2
Updating ~/Julia/gpu/Manifest.toml
[1520ce14] + AbstractTrees v0.3.3
[4c555306] + ArrayLayouts v0.4.10
[56f22d72] + Artifacts v1.3.0
[082447d4] + ChainRules v0.7.32
[d360d2e6] + ChainRulesCore v0.9.17
[cdddcdb0] + ChainRulesTestUtils v0.5.3
[944b1d66] + CodecZlib v0.7.0
[3da002f7] + ColorTypes v0.10.9
[5ae59095] + Colors v0.12.4
[bbf7d656] + CommonSubexpressions v0.3.0
[e66e0078] + CompilerSupportLibraries_jll v0.3.4+0
[adafc99b] + CpuId v0.2.2
[9a962f9c] + DataAPI v1.4.0
[163ba53b] + DiffResults v1.0.2
[b552c78f] + DiffRules v1.0.1
[ffbed154] + DocStringExtensions v0.8.3
[1a297f60] + FillArrays v0.9.7
[26cc04aa] + FiniteDifferences v0.11.2
[53c48c17] + FixedPointNumbers v0.8.4
[587475ba] + Flux v0.11.2
[f6369f11] + ForwardDiff v0.10.12
[d9f16b24] + Functors v0.1.0
[7869d1d1] + IRTools v0.4.1
[692b3bcd] + JLLWrappers v1.1.3
[e5e0dc1b] + Juno v0.8.4
[bdcacae8] + LoopVectorization v0.8.26
[e89f7d12] + Media v0.5.0
[e1d29d7a] + Missings v0.4.4
[46d2c3a1] + MuladdMacro v0.2.2
[77ba4419] + NaNMath v0.3.4
[6fe1bfb0] + OffsetArrays v1.4.0
[efe28fd5] + OpenSpecFun_jll v0.5.3+4
[708f8203] + Richardson v1.2.0
[21efa798] + SIMDPirates v0.8.25
[476501e8] + SLEEFPirates v0.5.5
[a2af1166] + SortingAlgorithms v0.3.1
[276daf66] + SpecialFunctions v0.10.3
[90137ffa] + StaticArrays v0.12.5
[2913bbd2] + StatsBase v0.33.2
[3bb67fe8] + TranscodingStreams v0.9.5
[3a884ed6] + UnPack v1.0.2
[3d5dd08c] + VectorizationBase v0.12.33
[a5390f91] + ZipFile v0.9.3
[83775a58] + Zlib_jll v1.2.11+18
[e88e6eb3] + Zygote v0.5.9
[700de1a5] + ZygoteRules v0.2.0
[9fa8497b] + Future
[9abbd945] + Profile

(gpu) pkg>

(gpu) pkg> test Flux
Testing Flux
Status /tmp/jl_TZZjQw/Project.toml
[1520ce14] AbstractTrees v0.3.3
[79e6a3ab] Adapt v2.3.0
[052768ef] CUDA v2.1.0
[944b1d66] CodecZlib v0.7.0
[5ae59095] Colors v0.12.4
[e30172f5] Documenter v0.25.3
[587475ba] Flux v0.11.2
[d9f16b24] Functors v0.1.0
[c8e1da08] IterTools v1.3.0
[e5e0dc1b] Juno v0.8.4
[1914dd2f] MacroTools v0.5.6
[872c559c] NNlib v0.7.6
[189a3867] Reexport v0.2.0
[2913bbd2] StatsBase v0.33.2
[a5390f91] ZipFile v0.9.3
[e88e6eb3] Zygote v0.5.9
[8bb1440f] DelimitedFiles
[37e2e46d] LinearAlgebra
[44cfe95a] Pkg
[de0858da] Printf
[9a3f8284] Random
[ea8e919c] SHA
[10745b16] Statistics
[8dfed614] Test
Status /tmp/jl_TZZjQw/Manifest.toml
[621f4979] AbstractFFTs v0.5.0
[1520ce14] AbstractTrees v0.3.3
[79e6a3ab] Adapt v2.3.0
[4c555306] ArrayLayouts v0.4.10
[56f22d72] Artifacts v1.3.0
[ab4f0b2a] BFloat16s v0.1.0
[fa961155] CEnum v0.4.1
[052768ef] CUDA v2.1.0
[082447d4] ChainRules v0.7.32
[d360d2e6] ChainRulesCore v0.9.17
[cdddcdb0] ChainRulesTestUtils v0.5.3
[944b1d66] CodecZlib v0.7.0
[3da002f7] ColorTypes v0.10.9
[5ae59095] Colors v0.12.4
[bbf7d656] CommonSubexpressions v0.3.0
[34da2185] Compat v3.23.0
[e66e0078] CompilerSupportLibraries_jll v0.3.4+0
[adafc99b] CpuId v0.2.2
[9a962f9c] DataAPI v1.4.0
[864edb3b] DataStructures v0.18.8
[163ba53b] DiffResults v1.0.2
[b552c78f] DiffRules v1.0.1
[ffbed154] DocStringExtensions v0.8.3
[e30172f5] Documenter v0.25.3
[e2ba6199] ExprTools v0.1.3
[1a297f60] FillArrays v0.9.7
[26cc04aa] FiniteDifferences v0.11.2
[53c48c17] FixedPointNumbers v0.8.4
[587475ba] Flux v0.11.2
[f6369f11] ForwardDiff v0.10.12
[d9f16b24] Functors v0.1.0
[0c68f7d7] GPUArrays v6.1.1
[61eb1bfa] GPUCompiler v0.8.3
[b5f81e59] IOCapture v0.1.1
[7869d1d1] IRTools v0.4.1
[c8e1da08] IterTools v1.3.0
[692b3bcd] JLLWrappers v1.1.3
[682c06a0] JSON v0.21.1
[e5e0dc1b] Juno v0.8.4
[929cbde3] LLVM v3.3.0
[bdcacae8] LoopVectorization v0.8.26
[1914dd2f] MacroTools v0.5.6
[e89f7d12] Media v0.5.0
[e1d29d7a] Missings v0.4.4
[46d2c3a1] MuladdMacro v0.2.2
[872c559c] NNlib v0.7.6
[77ba4419] NaNMath v0.3.4
[6fe1bfb0] OffsetArrays v1.4.0
[efe28fd5] OpenSpecFun_jll v0.5.3+4
[bac558e1] OrderedCollections v1.3.2
[69de0a69] Parsers v1.0.11
[189a3867] Reexport v0.2.0
[ae029012] Requires v1.1.0
[708f8203] Richardson v1.2.0
[21efa798] SIMDPirates v0.8.25
[476501e8] SLEEFPirates v0.5.5
[6c6a2e73] Scratch v1.0.3
[a2af1166] SortingAlgorithms v0.3.1
[276daf66] SpecialFunctions v0.10.3
[90137ffa] StaticArrays v0.12.5
[2913bbd2] StatsBase v0.33.2
[a759f4b9] TimerOutputs v0.5.7
[3bb67fe8] TranscodingStreams v0.9.5
[3a884ed6] UnPack v1.0.2
[3d5dd08c] VectorizationBase v0.12.33
[a5390f91] ZipFile v0.9.3
[83775a58] Zlib_jll v1.2.11+18
[e88e6eb3] Zygote v0.5.9
[700de1a5] ZygoteRules v0.2.0
[2a0f44e3] Base64
[ade2ca70] Dates
[8bb1440f] DelimitedFiles
[8ba89e20] Distributed
[9fa8497b] Future
[b77e0a4c] InteractiveUtils
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[a63ad114] Mmap
[44cfe95a] Pkg
[de0858da] Printf
[9abbd945] Profile
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA
[9e88b42a] Serialization
[1a1011a3] SharedArrays
[6462fe0b] Sockets
[2f01184e] SparseArrays
[10745b16] Statistics
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
Test Summary: | Pass Total
Utils | 60 60
Test Summary: | Pass Total
Onehot | 9 9
Test Summary: | Pass Total
Optimise | 27 27
[ Info: Downloading CMUDict dataset
[ Info: Downloading MNIST dataset
[ Info: Downloading MNIST dataset
[ Info: Downloading MNIST dataset
[ Info: Downloading MNIST dataset
[ Info: Downloading Fashion-MNIST dataset
[ Info: Downloading Fashion-MNIST dataset
[ Info: Downloading Fashion-MNIST dataset
[ Info: Downloading Fashion-MNIST dataset
[ Info: Downloading sentiment treebank dataset
[ Info: Downloading iris dataset.
[ Info: Downloading the Boston housing Dataset
Test Summary: | Pass Total
Data | 54 54
Test Summary: | Pass Total
Losses | 88 88
┌ Warning: Slow fallback implementation invoked for conv! You probably don't want this; check your datatypes.
│ yT = Float64
│ T1 = Float32
│ T2 = Float64
└ @ NNlib ~/.julia/packages/NNlib/Z9qZP/src/conv.jl:206
Test Summary: | Pass Total
Layers | 219 219
[ Info: Testing GPU Support
[ Info: Testing Flux/CUDNN
R = RNN, batch_size = 5: Test Failed at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
Expression: x̄ ≈ collect(cux̄)
Evaluated: [-1.2563565767280167 -0.32043449857558765 … 0.6598185823329246 -0.8743064105599022; 0.4139337826036863 -0.0489961935370902 … 0.7104068283356215 -0.21005488704762065; … ; -1.1329831875497842 0.09261050769960308 … -0.36696239598997715 0.11156129322481846; -0.0898901569255448 -0.3161775390020851 … 0.21676706154085668 -0.09266956050321833] ≈ Float32[-1.2561821 -0.3207922 … 0.6599076 -0.87420034; 0.4134525 -0.04903677 … 0.71047 -0.20995572; … ; -1.1331674 0.09284677 … -0.3665883 0.11111802; -0.090046406 -0.31599757 … 0.2170124 -0.09270978]
Stacktrace:
[1] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
[2] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1190
[3] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
[4] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[5] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
R = GRU, batch_size = 1: Test Failed at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
Expression: x̄ ≈ collect(cux̄)
Evaluated: [-0.22776741748701154, 0.3209347956882381, 0.10693373763558503, -0.41107076672434856, -0.32476707616330297, 0.352650092060774, -0.04047323530121069, 0.1934077904424144, -0.2822926974842711, 0.19332311343347433] ≈ Float32[-0.22774479, 0.32102972, 0.10696593, -0.41122618, -0.32493117, 0.35273764, -0.040478405, 0.19339433, -0.28233784, 0.1934645]
Stacktrace:
[1] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
[2] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1190
[3] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
[4] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[5] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
R = LSTM, batch_size = 5: Test Failed at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
Expression: x̄ ≈ collect(cux̄)
Evaluated: [-0.10268096525608544 0.00994827058360617 … 0.16947322399116943 -0.20729710678265328; 0.08206973619000305 -0.01609505278780965 … -0.12169657342862913 0.10296755973293963; … ; 0.05812821623121952 0.16542800217809267 … -0.1275825994643608 0.10090241858739234; -0.030771661228154303 0.0030305073944870637 … -0.04548004543916577 0.12628198308549335] ≈ Float32[-0.10267152 0.00988317 … 0.169493 -0.20725344; 0.082046695 -0.01610528 … -0.12171983 0.102957785; … ; 0.058113117 0.16547728 … -0.12760134 0.100878865; -0.030738413 0.0030465734 … -0.045501728 0.1262683]
Stacktrace:
[1] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:38
[2] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1190
[3] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
[4] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
[5] top-level scope at /home/nima/.julia/packages/Flux/q3zeA/test/cuda/curnn.jl:16
Test Summary: | Pass Fail Broken Total
CUDA | 110 3 35 148
CUDA | 9 9
onecold gpu | 2 2
restructure gpu | 1 1
GPU functors | 2 2
Losses | 44 1 45
Basic GPU Movement | 2 2
Conv GPU grad tests | 6 1 7
Pooling GPU grad tests | 2 2
AdaptivePooling GPU grad tests | 2 2
Dropout GPU grad tests | 1 1 2
Normalising GPU grad tests | 4 4
InstanceNorm GPU grad tests | 1 1
GroupNorm GPU grad tests | 1 1
Stateless GPU grad tests | 1 1
CUDNN BatchNorm | 8 8
R = RNN | 1 2 3
R = GRU | 1 2 3
R = LSTM | 1 2 3
RNN | 23 3 24 50
R = RNN, batch_size = 1 | 4 4 8
R = RNN, batch_size = 5 | 3 1 4 8
R = GRU, batch_size = 1 | 3 1 4 8
R = GRU, batch_size = 5 | 4 4 8
R = LSTM, batch_size = 1 | 5 4 9
R = LSTM, batch_size = 5 | 4 1 4 9
ERROR: LoadError: Some tests did not pass: 110 passed, 3 failed, 0 errored, 35 broken.
in expression starting at /home/nima/.julia/packages/Flux/q3zeA/test/runtests.jl:37
ERROR: Package Flux errored during testing

Glad it worked now! Regarding Flux, the RNN errors should already be fixed on current master. I can't tell, however, about the CUDA cusolver failure.
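
If you want to try that before a release is tagged, adding the master branch in the same environment should work (a quick sketch):

(gpu) pkg> add Flux#master

(gpu) pkg> test Flux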