Request for testing: CUDA from BinaryBuilder

Hi all,

We are looking into using BinaryBuilder to install CUDA when you use the Julia/CUDA stack, i.e. CuArrays or CUDAnative. Support for this has been merged into the master branches of the respective repositories, and it would be great to get some feedback from real users. If you use any of these packages and have a moment, please run the following:

] add CUDAnative#master CuArrays#master GPUArrays#master
] test CUDAnative CuArrays

For those of you who want to use a local CUDA installation, run with JULIA_CUDA_USE_BINARYBUILDER=false.
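
A minimal sketch of setting that variable for a single command only, so the rest of the shell session keeps the default artifact-based behaviour. The helper name `run_with_local_cuda` is made up for illustration; only the variable name comes from this post.

```shell
#!/bin/sh
# Hypothetical helper: run one command with the BinaryBuilder-provided CUDA
# disabled. `env` scopes the variable to that command only.
run_with_local_cuda() {
    env JULIA_CUDA_USE_BINARYBUILDER=false "$@"
}

# Example (commented out; needs julia and a local CUDA toolkit):
#   run_with_local_cuda julia -e 'using Pkg; Pkg.test("CUDAnative")'
```

Exporting the variable in your shell profile would work too, but scoping it per command makes it easier to compare both setups side by side.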

If any of this fails for reasons related to the CUDA installation, let us know here. Please run with JULIA_DEBUG=all when doing so, or when you want to confirm whether artifacts are being used. For other problems, feel free to file issues :)
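
To check whether artifacts are used, one option is to filter the debug output for the lines that mention them. The sketch below assumes the debug messages contain the phrase "from an artifact at", as they do in the logs posted in this thread; the helper name `artifact_lines` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines that report an artifact in use.
# Reads the log on stdin.
artifact_lines() {
    grep 'from an artifact at'
}

# Example (commented out; needs julia and a GPU). Julia's default logger
# writes to stderr, hence the 2>&1:
#   JULIA_DEBUG=all julia -e 'using CuArrays' 2>&1 | artifact_lines
```

If the filter prints nothing, either a local installation was picked up or the message wording differs in your version, so look at the full debug output before drawing conclusions.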


This is awesome for lazy people like me!

[alir@TARTARUS ~]$ JULIA_DEBUG=all julia

(v1.3) pkg> st
    Status `~/.julia/environments/v1.3/Project.toml`
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)

All tests passed!

Test Summary: | Pass  Total
CUDAnative    |  528    528
   Testing CUDAnative tests passed

Test Summary: | Pass  Broken  Total
CuArrays      | 6217       1   6218
   Testing CuArrays tests passed 

This was on a computer with 4x TITAN V.


Artifacts are fantastic to keep things running reliably.

julia> versioninfo()
Julia Version 1.4.0-rc1.0
Commit b0c33b0cf5 (2020-01-23 17:23 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_EDITOR = vim

Are errors like this expected for a non-root user?

Downloading artifact: CUDNN+CUDA10.1
Downloading artifact: CUTENSOR+CUDA10.1
[ Info: Testing using device GeForce GTX 1080 Ti (compute capability 6.1.0, 10.278 GiB available memory) on CUDA driver 10.1.0 and toolkit 10.1.243
[ Info: Building the CUDAnative run-time library for your sm_61 device, this might take a while...
basic reflection: Error During Test at /home/jc/.julia/packages/CUDAnative/1UYFF/test/device/codegen.jl:109
  Test threw exception
  Expression: CUDAnative.code_sass(devnull, valid_kernel, Tuple{}) == nothing
  CUPTIError: user doesn't have sufficient privileges which are required to start the profiling session (code 35, CUPTI_ERROR_INSUFFICIENT_PRIVILEGES)
  Stacktrace:
   [1] throw_api_error(::CUDAnative.CUPTI.CUptiResult) at /home/jc/.julia/packages/CUDAnative/1UYFF/src/cupti/error.jl:117
   [2] macro expansion at /home/jc/.julia/packages/CUDAnative/1UYFF/src/cupti/error.jl:130 [inlined]
   [3] cuptiSubscribe at /home/jc/.julia/packages/CUDAnative/1UYFF/src/cupti/libcupti.jl:197 [inlined]
   [4] code_sass(::Base.DevNull, ::CUDAnative.CompilerJob; verbose::Bool) at /home/jc/.julia/packages/CUDAnative/1UYFF/src/reflection.jl:151
   [5] code_sass(::Base.DevNull, ::Any, ::Any; cap::VersionNumber, kernel::Bool, verbose::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/jc/.julia/packages/CUDAnative/1UYFF/src/reflection.jl:136
   [6] code_sass(::Base.DevNull, ::Any, ::Any) at /home/jc/.julia/packages/CUDAnative/1UYFF/src/reflection.jl:134
   [7] top-level scope at /home/jc/.julia/packages/CUDAnative/1UYFF/test/device/codegen.jl:109
   [8] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1113
   [9] top-level scope at /home/jc/.julia/packages/CUDAnative/1UYFF/test/device/codegen.jl:106
   [10] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1113
   [11] top-level scope at /home/jc/.julia/packages/CUDAnative/1UYFF/test/device/codegen.jl:105
   [12] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1113
   [13] top-level scope at /home/jc/.julia/packages/CUDAnative/1UYFF/test/device/codegen.jl:5

Also, when running as root:

[ Info: Testing using device GeForce GTX 1080 Ti (compute capability 6.1.0, 10.770 GiB available memory) on CUDA driver 10.1.0 and toolkit 10.1.243
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called hypot(x::T, y::T) where T<:AbstractFloat in Base.Math at math.jl:619, maybe you intended to call hypot(x::Float32, y::Float32) in CUDAnative at /home/jc/.julia/packages/CUDAnative/1UYFF/src/device/cuda/math.jl:311 instead?
│    Stacktrace:
│     [1] hypot at math.jl:619
│     [2] reduce_kernel at /home/jc/.julia/packages/GPUArrays/QYxut/src/host/mapreduce.jl:134
└ @ CUDAnative ~/.julia/packages/CUDAnative/1UYFF/src/compiler/irgen.jl:111
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called hypot(x::T, y::T) where T<:AbstractFloat in Base.Math at math.jl:619, maybe you intended to call hypot(x::Float32, y::Float32) in CUDAnative at /home/jc/.julia/packages/CUDAnative/1UYFF/src/device/cuda/math.jl:311 instead?
│    Stacktrace:
│     [1] hypot at math.jl:619
│     [2] reduce_kernel at /home/jc/.julia/packages/GPUArrays/QYxut/src/host/mapreduce.jl:134
└ @ CUDAnative ~/.julia/packages/CUDAnative/1UYFF/src/compiler/irgen.jl:111
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called hypot(x::T, y::T) where T<:AbstractFloat in Base.Math at math.jl:619, maybe you intended to call hypot(x::Float64, y::Float64) in CUDAnative at /home/jc/.julia/packages/CUDAnative/1UYFF/src/device/cuda/math.jl:310 instead?
│    Stacktrace:
│     [1] hypot at math.jl:619
│     [2] reduce_kernel at /home/jc/.julia/packages/GPUArrays/QYxut/src/host/mapreduce.jl:134
└ @ CUDAnative ~/.julia/packages/CUDAnative/1UYFF/src/compiler/irgen.jl:111
┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called hypot(x::T, y::T) where T<:AbstractFloat in Base.Math at math.jl:619, maybe you intended to call hypot(x::Float64, y::Float64) in CUDAnative at /home/jc/.julia/packages/CUDAnative/1UYFF/src/device/cuda/math.jl:310 instead?
│    Stacktrace:
│     [1] hypot at math.jl:619
│     [2] reduce_kernel at /home/jc/.julia/packages/GPUArrays/QYxut/src/host/mapreduce.jl:134
└ @ CUDAnative ~/.julia/packages/CUDAnative/1UYFF/src/compiler/irgen.jl:111
[ Info: Testing CUDNN 7.6.5
┌ Warning: Not testing CUTENSOR
└ @ Main ~/.julia/packages/CuArrays/lLQny/test/tensor.jl:7
[ Info: Testing ForwardDiff integration
Test Summary: | Pass  Total
CuArrays      | 4967   4967
    Testing CuArrays tests passed

Yes, NVIDIA has disabled profiling for non-root users: https://developer.nvidia.com/nvidia-development-tools-solutions-err-nvgpuctrperm-nvprof

The other warnings are expected for now.


The build works fine on our Cray system:

(testcu) pkg> status
    Status `~/testcu/Project.toml`
  (empty environment)
(testcu) pkg> add CUDAnative#master CuArrays#master GPUArrays#master
  Updating registry at `~/.julia/1.3.1/daint-mc/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating git-repo `https://github.com/JuliaGPU/CUDAnative.jl.git`
  Updating git-repo `https://github.com/JuliaGPU/CuArrays.jl.git`
  Updating git-repo `https://github.com/JuliaGPU/GPUArrays.jl.git`
 Resolving package versions...
  Updating `~/testcu/Project.toml`
  [be33ccc6] + CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] + CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] + GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  Updating `~/testcu/Manifest.toml`
  [621f4979] + AbstractFFTs v0.5.0
  [79e6a3ab] + Adapt v1.0.1
  [b99e7846] + BinaryProvider v0.5.8
  [fa961155] + CEnum v0.2.0
  [3895d2a7] + CUDAapi v3.1.0
  [c5f51814] + CUDAdrv v6.0.0
  [be33ccc6] + CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] + CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] + DataStructures v0.17.9
  [0c68f7d7] + GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [929cbde3] + LLVM v1.3.3
  [1914dd2f] + MacroTools v0.5.4
  [872c559c] + NNlib v0.6.4
  [bac558e1] + OrderedCollections v1.1.0
  [189a3867] + Reexport v0.2.0
  [ae029012] + Requires v1.0.1
  [a759f4b9] + TimerOutputs v0.5.3
  [2a0f44e3] + Base64 
  [ade2ca70] + Dates 
  [8ba89e20] + Distributed 
  [b77e0a4c] + InteractiveUtils 
  [76f85450] + LibGit2 
  [8f399da3] + Libdl 
  [37e2e46d] + LinearAlgebra 
  [56ddb016] + Logging 
  [d6f4376e] + Markdown 
  [44cfe95a] + Pkg 
  [de0858da] + Printf 
  [3fa0cd96] + REPL 
  [9a3f8284] + Random 
  [ea8e919c] + SHA 
  [9e88b42a] + Serialization 
  [6462fe0b] + Sockets 
  [2f01184e] + SparseArrays 
  [10745b16] + Statistics 
  [8dfed614] + Test 
  [cf7118a7] + UUIDs 
  [4ec0a83e] + Unicode 

(testcu) pkg> status
    Status `~/testcu/Project.toml`
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)

(testcu) pkg> build CUDAnative

(testcu) pkg> build CuArrays
  Building NNlib → `~/.julia/1.3.1/daint-mc/packages/NNlib/3krvM/deps/build.log`

(testcu) pkg> build GPUArrays

julia> 

Unfortunately, the tests give some errors (I ran them on a compute node with 1 GPU):

(testcu) pkg> test CUDAnative
   Testing CUDAnative
 Resolving package versions...
    Status `/tmp/jl_hhbj5A/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [929cbde3] LLVM v1.3.3
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
┌ Debug: CUDA toolkit identified as 10.1.243
└ @ CUDAapi ~/.julia/1.3.1/daint-mc/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Using CUDA 10.1.243 from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/f583fb4bf816d52e5ac27a2035f4d25768b0814c
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:174
┌ Debug: Toolchain with LLVM 6.0.1, CUDA driver 10.1.0 and toolkit 10.1.243 supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2 and 7.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0 and 6.0
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:197
┌ Debug: Rejecting stale cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/NNlib/A7zdE_yYdIU.ji (mtime 1.582133362888345e9) because file /users/omlins/.julia/1.3.1/daint-mc/packages/NNlib/3krvM/deps/deps.jl (mtime 1.582198971551903e9) has changed
└ @ Base loading.jl:1474
┌ Debug: Required dependency NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd] failed to load from cache file for /users/omlins/.julia/1.3.1/daint-mc/packages/NNlib/3krvM/src/NNlib.jl.
└ @ Base loading.jl:767
┌ Debug: Precompiling CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]
└ @ Base loading.jl:1273
┌ Debug: CUDA toolkit identified as 10.1.243
└ @ CUDAapi ~/.julia/1.3.1/daint-mc/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Using CUDA 10.1.243 from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/f583fb4bf816d52e5ac27a2035f4d25768b0814c
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:174
┌ Debug: Toolchain with LLVM 6.0.1, CUDA driver 10.1.0 and toolkit 10.1.243 supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2 and 7.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0 and 6.0
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:197
┌ Debug: Rejecting cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/Requires/IyxeS_R8TGG.ji because it is for file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/9Jse8/src/Requires.jl) not file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/qy6zC/src/Requires.jl
└ @ Base loading.jl:1459
┌ Debug: Rejecting stale cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/NNlib/A7zdE_yYdIU.ji (mtime 1.582133362888345e9) because file /users/omlins/.julia/1.3.1/daint-mc/packages/NNlib/3krvM/deps/deps.jl (mtime 1.582198971551903e9) has changed
└ @ Base loading.jl:1474
┌ Debug: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
└ @ Base loading.jl:1273
┌ Debug: Rejecting cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/Requires/IyxeS_R8TGG.ji because it provides the wrong uuid (got 4837871966183297) for mod (want 1823799888678804)
└ @ Base loading.jl:1451
┌ Debug: Rejecting cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/Requires/IyxeS_R8TGG.ji because it is for file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/9Jse8/src/Requires.jl) not file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/qy6zC/src/Requires.jl
└ @ Base loading.jl:1459
┌ Debug: Using CUDA 10.1.0 from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/f583fb4bf816d52e5ac27a2035f4d25768b0814c
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:76
┌ Debug: Using CUDNN from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/71215498c41fadf9fd76244f5e08ab431fadbd04
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:120
┌ Debug: Using CUTENSOR from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/60b7126923636577370a136e91590e0f28d3147b
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:146
┌ Debug: Initializing CUDA on thread 1
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/init.jl:35
┌ Debug: Initializing CUDA on thread 1
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/init.jl:35
CUDAnative: Error During Test at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/test/runtests.jl:8
  Got exception outside of a @test
  CUDA error: invalid device ordinal (code 101, ERROR_INVALID_DEVICE)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] macro expansion at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/error.jl:144 [inlined]
   [3] cuCtxCreate_v2 at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/libcuda.jl:108 [inlined]
   [4] CuContext(::CuDevice, ::CUDAdrv.CUctx_flags_enum) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:73
   [5] CuContext at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:72 [inlined]
   [6] CuContext(::var"#7#15", ::CuDevice) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:118
   [7] iterate at ./none:0 [inlined]
   [8] collect(::Base.Generator{CUDAdrv.DeviceSet,var"#6#14"}) at ./array.jl:622
   [9] top-level scope at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/test/runtests.jl:56
   [10] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [11] top-level scope at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/test/runtests.jl:10
   [12] include at ./boot.jl:328 [inlined]
   [13] include_relative(::Module, ::String) at ./loading.jl:1105
   [14] include(::Module, ::String) at ./Base.jl:31
   [15] include(::String) at ./client.jl:424
   [16] top-level scope at none:6
   [17] eval(::Module, ::Any) at ./boot.jl:330
   [18] exec_options(::Base.JLOptions) at ./client.jl:263
   [19] _start() at ./client.jl:460
  
Test Summary: | Pass  Error  Total
CUDAnative    |    7      1      8
ERROR: LoadError: Some tests did not pass: 7 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/test/runtests.jl:8
ERROR: Package CUDAnative errored during testing

(testcu) pkg> test CuArrays
   Testing CuArrays
 Resolving package versions...
    Status `/tmp/jl_lodX5H/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [bbf7d656] CommonSubexpressions v0.2.0
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.9
  [163ba53b] DiffResults v1.0.2
  [b552c78f] DiffRules v1.0.1
  [7a1cc6ca] FFTW v1.2.0
  [f5851436] FFTW_jll v3.3.9+3
  [1a297f60] FillArrays v0.8.4
  [f6369f11] ForwardDiff v0.10.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [1d5cc7b8] IntelOpenMP_jll v2018.0.3+0
  [929cbde3] LLVM v1.3.3
  [856f044c] MKL_jll v2019.0.117+2
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [77ba4419] NaNMath v0.3.3
  [efe28fd5] OpenSpecFun_jll v0.5.3+1
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [276daf66] SpecialFunctions v0.10.0
  [90137ffa] StaticArrays v0.12.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
┌ Debug: Rejecting cache file /users/omlins/.julia/1.3.1/daint-mc/compiled/v1.3/Requires/IyxeS_R8TGG.ji because it is for file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/9Jse8/src/Requires.jl) not file /users/omlins/.julia/1.3.1/daint-mc/packages/Requires/qy6zC/src/Requires.jl
└ @ Base loading.jl:1459
┌ Debug: CUDA toolkit identified as 10.1.243
└ @ CUDAapi ~/.julia/1.3.1/daint-mc/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Using CUDA 10.1.243 from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/f583fb4bf816d52e5ac27a2035f4d25768b0814c
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:174
┌ Debug: Toolchain with LLVM 6.0.1, CUDA driver 10.1.0 and toolkit 10.1.243 supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2 and 7.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0 and 6.0
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/bindeps.jl:197
┌ Debug: Using CUDA 10.1.0 from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/f583fb4bf816d52e5ac27a2035f4d25768b0814c
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:76
┌ Debug: Using CUDNN from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/71215498c41fadf9fd76244f5e08ab431fadbd04
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:120
┌ Debug: Using CUTENSOR from an artifact at /users/omlins/.julia/1.3.1/daint-mc/artifacts/60b7126923636577370a136e91590e0f28d3147b
└ @ CuArrays ~/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/src/bindeps.jl:146
┌ Debug: Initializing CUDA on thread 1
└ @ CUDAnative ~/.julia/1.3.1/daint-mc/packages/CUDAnative/1UYFF/src/init.jl:35
ERROR: LoadError: CUDA error: invalid device ordinal (code 101, ERROR_INVALID_DEVICE)
Stacktrace:
 [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/error.jl:131
 [2] macro expansion at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/error.jl:144 [inlined]
 [3] cuCtxCreate_v2 at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/libcuda.jl:108 [inlined]
 [4] CuContext(::CuDevice, ::CUDAdrv.CUctx_flags_enum) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:73
 [5] CuContext at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:72 [inlined]
 [6] CuContext(::var"#5#7", ::CuDevice) at /users/omlins/.julia/1.3.1/daint-mc/packages/CUDAdrv/b1mvw/src/context.jl:118
 [7] iterate at ./none:0 [inlined]
 [8] collect(::Base.Generator{CUDAdrv.DeviceSet,var"#4#6"}) at ./array.jl:622
 [9] top-level scope at /users/omlins/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/test/runtests.jl:22
 [10] include at ./boot.jl:328 [inlined]
 [11] include_relative(::Module, ::String) at ./loading.jl:1105
 [12] include(::Module, ::String) at ./Base.jl:31
 [13] include(::String) at ./client.jl:424
 [14] top-level scope at none:6
in expression starting at /users/omlins/.julia/1.3.1/daint-mc/packages/CuArrays/lLQny/test/runtests.jl:22
ERROR: Package CuArrays errored during testing

Let me know what I can do to help you track down the issue.

Thanks!

Great news!

Tested on a Fedora 31 system (5.4.20-200.fc31.x86_64 #1 SMP Mon Feb 17 19:31:55 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux) with a GeForce GTX 950 (compute capability 5.2.0, 1.654 GiB available memory), CUDA driver 10.2.0, toolkit 10.2.89 and Julia 1.4.0-rc1.0 (2020-01-23), with MKL.jl installed.

Got the same warning 4 times while testing CuArrays:

┌ Warning: calls to Base intrinsics might be GPU incompatible
│   exception =
│    You called hypot(x::T, y::T) where T<:AbstractFloat in Base.Math at math.jl:619, maybe you intended to call hypot(x::Float32, y::Float32) in CUDAnative at /home/lenz/.julia/packages/CUDAnative/1UYFF/src/device/cuda/math.jl:311 instead?
│    Stacktrace:
│     [1] hypot at math.jl:619
│     [2] reduce_kernel at /home/lenz/.julia/packages/GPUArrays/QYxut/src/host/mapreduce.jl:134
└ @ CUDAnative ~/.julia/packages/CUDAnative/1UYFF/src/compiler/irgen.jl:111

Seems to be OK

Test Summary: | Pass  Total
CUDAnative    |  508    508

Test Summary: | Pass  Total
CuArrays      | 4967   4967
    Testing CuArrays tests passed

This sounds like https://discourse.julialang.org/t/cuarrays-error-calling-cuarray-error-invalid-device/18654; are you running in exclusive-process compute mode? The CUDAdrv tests won’t pass then, because the test suite creates a bunch of contexts and switches between them. This is unrelated to this change.
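
A quick way to check the compute mode is to ask `nvidia-smi` and look for `Exclusive_Process` in its output. This is only a sketch: the helper name `has_exclusive_process` is made up for illustration, and it assumes `nvidia-smi` is available on the node where the tests run.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if any GPU reports exclusive-process mode.
# Reads one compute-mode string per line on stdin.
has_exclusive_process() {
    grep -q 'Exclusive_Process'
}

# Example (commented out; needs the NVIDIA driver tools):
#   nvidia-smi --query-gpu=compute_mode --format=csv,noheader | has_exclusive_process \
#       && echo "exclusive-process mode: tests that create several contexts will likely fail"
```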

Thanks for testing, all!

I retried setting CRAY_CUDA_MPS=1 to allow multiple processes on a GPU. It got further in the tests, but there are still a few errors:

omlins@dom101:/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin> salloc -Cgpu -Ausup --time=01:00:00 -N1
salloc: Pending job allocation 936382
salloc: job 936382 queued and waiting for resources
salloc: job 936382 has been allocated resources
salloc: Granted job allocation 936382
omlins@dom101:/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin> srun -n1 -N1 --pty bash
omlins@nid00000:/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin> export CRAY_CUDA_MPS=1
omlins@nid00000:/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin> export JULIA_DEBUG=all
omlins@nid00000:/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin> ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.3.1 (2019-12-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(v1.3) pkg> status
    Status `~/.julia/environments/v1.3/Project.toml`
  (empty environment)

(v1.3) pkg> activate .
Activating environment at `/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin/Project.toml`

(bin) pkg>  status
    Status `/scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin/Project.toml`
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)

(bin) pkg> test CUDAnative
   Testing CUDAnative
 Resolving package versions...
    Status `/tmp/jl_VwSNuR/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [929cbde3] LLVM v1.3.3
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
┌ Debug: Precompiling CUDAnative [be33ccc6-a3ff-5ff2-a52e-74243cff1e17]
└ @ Base loading.jl:1273
┌ Debug: Precompiling CUDAapi [3895d2a7-ec45-59b8-82bb-cfc6a382f9b3]
└ @ Base loading.jl:1273
┌ Debug: Precompiling CUDAdrv [c5f51814-7f29-56b8-a69c-e4d8f6be1fde]
└ @ Base loading.jl:1273
┌ Debug: Precompiling CEnum [fa961155-64e5-5f13-b03f-caf6b980ea82]
└ @ Base loading.jl:1273
┌ Debug: Precompiling LLVM [929cbde3-209d-540e-8aea-75f648917ca0]
└ @ Base loading.jl:1273
┌ Debug: Found LLVM v6.0.1 at /scratch/snx1600tds/omlins/test_cujulia_install/julia-1.3.1/bin/../lib/julia/libLLVM-6.0.so with support for AArch64, AMDGPU, ARC, ARM, AVR, BPF, Hexagon, Lanai, MSP430, Mips, NVPTX, PowerPC, RISCV, Sparc, SystemZ, WebAssembly, X86, XCore
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:47
┌ Debug: Using LLVM.jl wrapper for LLVM v6.0
└ @ LLVM ~/.julia/packages/LLVM/DAnFH/src/LLVM.jl:75
┌ Debug: Precompiling Adapt [79e6a3ab-5dfb-504d-930d-738a2a938a0e]
└ @ Base loading.jl:1273
┌ Debug: Precompiling TimerOutputs [a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f]
└ @ Base loading.jl:1273
┌ Debug: Precompiling DataStructures [864edb3b-99cc-5e75-8d2d-829cb0a9cfe8]
└ @ Base loading.jl:1273
┌ Debug: Precompiling OrderedCollections [bac558e1-5e72-5ebc-8fee-abe8a469f55d]
└ @ Base loading.jl:1273
┌ Debug: Precompiling MacroTools [1914dd2f-81c6-5fcd-8719-6d5c9610ff09]
└ @ Base loading.jl:1273
┌ Debug: CUDA toolkit identified as 10.2.89
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Using CUDA 10.2.89 from an artifact at /users/omlins/.julia/artifacts/93956fcdec9ac5ea76289d25066f02c2f4ebe56e
└ @ CUDAnative ~/.julia/packages/CUDAnative/nQ8Yi/src/bindeps.jl:174
┌ Debug: Toolchain with LLVM 6.0.1, CUDA driver 10.2.0 and toolkit 10.2.89 supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2 and 7.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0 and 6.0
└ @ CUDAnative ~/.julia/packages/CUDAnative/nQ8Yi/src/bindeps.jl:197
┌ Debug: Precompiling GPUArrays [0c68f7d7-f131-5f86-a1c3-88cf8149b2d7]
└ @ Base loading.jl:1273
┌ Debug: Precompiling AbstractFFTs [621f4979-c628-5d54-868e-fcf4e3e8185c]
└ @ Base loading.jl:1273
┌ Debug: Precompiling Requires [ae029012-a4dd-5104-9daa-d747884805df]
└ @ Base loading.jl:1273
┌ Debug: Precompiling NNlib [872c559c-99b0-510c-b3b7-b6c96a88d5cd]
└ @ Base loading.jl:1273
┌ Debug: Precompiling Reexport [189a3867-3050-52da-a836-e630ba90ab69]
└ @ Base loading.jl:1273
[ Info: Testing using device Tesla P100-PCIE-16GB (compute capability 6.0.0, 15.488 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
[ Info: Building the CUDAnative run-time library for your sm_60 device, this might take a while...
basic usage: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#hello#336", ::Type{Tuple{}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:847 [inlined]
   [11] (::var"#236#339"{var"#hello#336"})() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#236#339"{var"#hello#336"}, ::IOStream) at ./stream.jl:1152
   [13] #235 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#235#338"{var"#hello#336"}, ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #234 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#234#337"{var"#hello#336"}, ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:846
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:835
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  
anonymous functions: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:853
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#1367#hello#340", ::Type{Tuple{}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:862 [inlined]
   [11] (::var"#240#344"{var"#1367#hello#340"})() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#240#344"{var"#1367#hello#340"}, ::IOStream) at ./stream.jl:1152
   [13] #239 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#239#343"{var"#1367#hello#340"}, ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #238 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#238#342"{var"#1367#hello#340"}, ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:861
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:854
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  
closures: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:869
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#1369#hello#345", ::Type{Tuple{}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:879 [inlined]
   [11] (::var"#244#349"{var"#1369#hello#345"})() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#244#349"{var"#1369#hello#345"}, ::IOStream) at ./stream.jl:1152
   [13] #243 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#243#348"{var"#1369#hello#345"}, ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #242 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#242#347"{var"#1369#hello#345"}, ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:878
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:870
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  
argument passing: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:886
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#1371#kernel#350", ::Type{Tuple{CuDeviceArray{Int64,1,CUDAnative.AS.Global}}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179
   [8] top-level scope at gcutils.jl:91
   [9] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176
   [10] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:918
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [12] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:889
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [14] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [15] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [16] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [17] include at ./boot.jl:328 [inlined]
   [18] include_relative(::Module, ::String) at ./loading.jl:1105
   [19] include(::Module, ::String) at ./Base.jl:31
   [20] include(::String) at ./client.jl:424
   [21] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [22] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [23] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [24] include at ./boot.jl:328 [inlined]
   [25] include_relative(::Module, ::String) at ./loading.jl:1105
   [26] include(::Module, ::String) at ./Base.jl:31
   [27] include(::String) at ./client.jl:424
   [28] top-level scope at none:6
   [29] eval(::Module, ::Any) at ./boot.jl:330
   [30] exec_options(::Base.JLOptions) at ./client.jl:263
   [31] _start() at ./client.jl:460
  
self-recursion: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:923
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(kernel), ::Type{Tuple{Bool}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:935 [inlined]
   [11] (::var"#250#357")() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#250#357", ::IOStream) at ./stream.jl:1152
   [13] #249 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#249#356", ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #248 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#248#355", ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:934
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:924
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  
deep recursion: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:941
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::typeof(kernel_a), ::Type{Tuple{Bool}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:966 [inlined]
   [11] (::var"#253#360")() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#253#360", ::IOStream) at ./stream.jl:1152
   [13] #252 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#252#359", ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #251 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#251#358", ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:965
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:942
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  

Continuation of the error (as there were too many characters to post in one reply):


streams: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:972
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#1378#hello#361", ::Type{Tuple{}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179 [inlined]
   [8] macro expansion at ./gcutils.jl:91 [inlined]
   [9] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176 [inlined]
   [10] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:987 [inlined]
   [11] (::var"#256#364"{var"#1378#hello#361"})() at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:41
   [12] redirect_stdout(::var"#256#364"{var"#1378#hello#361"}, ::IOStream) at ./stream.jl:1152
   [13] #255 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:40 [inlined]
   [14] #open#271(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(open), ::var"#255#363"{var"#1378#hello#361"}, ::String, ::Vararg{String,N} where N) at ./io.jl:298
   [15] open at ./io.jl:296 [inlined]
   [16] #254 at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:39 [inlined]
   [17] mktemp(::var"#254#362"{var"#1378#hello#361"}, ::String) at ./file.jl:611
   [18] mktemp(::Function) at ./file.jl:609
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/util.jl:37
   [20] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:986
   [21] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [22] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:973
   [23] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [24] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:834
   [25] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [26] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/execution.jl:5
   [27] include at ./boot.jl:328 [inlined]
   [28] include_relative(::Module, ::String) at ./loading.jl:1105
   [29] include(::Module, ::String) at ./Base.jl:31
   [30] include(::String) at ./client.jl:424
   [31] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:89
   [32] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [33] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [34] include at ./boot.jl:328 [inlined]
   [35] include_relative(::Module, ::String) at ./loading.jl:1105
   [36] include(::Module, ::String) at ./Base.jl:31
   [37] include(::String) at ./client.jl:424
   [38] top-level scope at none:6
   [39] eval(::Module, ::Any) at ./boot.jl:330
   [40] exec_options(::Base.JLOptions) at ./client.jl:263
   [41] _start() at ./client.jl:460
  
libcudadevrt: Error During Test at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/cuda.jl:910
  Got exception outside of a @test
  CUDA error: operation not supported (code 801, ERROR_NOT_SUPPORTED)
  Stacktrace:
   [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
   [2] CuModule(::Array{UInt8,1}, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module.jl:42
   [3] CuModule(::CUDAdrv.CuLinkImage, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/module/linker.jl:142
   [4] macro expansion at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:425 [inlined]
   [5] #cufunction#218(::Nothing, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction), ::var"#3572#kernel#610", ::Type{Tuple{}}) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [6] cufunction(::Function, ::Type) at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:360
   [7] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:179
   [8] top-level scope at gcutils.jl:91
   [9] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/src/execution.jl:176
   [10] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/cuda.jl:912
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [12] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/cuda.jl:911
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [14] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/device/cuda.jl:5
   [15] include at ./boot.jl:328 [inlined]
   [16] include_relative(::Module, ::String) at ./loading.jl:1105
   [17] include(::Module, ::String) at ./Base.jl:31
   [18] include(::String) at ./client.jl:424
   [19] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:92
   [20] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [21] top-level scope at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:10
   [22] include at ./boot.jl:328 [inlined]
   [23] include_relative(::Module, ::String) at ./loading.jl:1105
   [24] include(::Module, ::String) at ./Base.jl:31
   [25] include(::String) at ./client.jl:424
   [26] top-level scope at none:6
   [27] eval(::Module, ::Any) at ./boot.jl:330
   [28] exec_options(::Base.JLOptions) at ./client.jl:263
   [29] _start() at ./client.jl:460
  
Test Summary:                                  | Pass  Error  Total
CUDAnative                                     |  519      8    527
  base interface                               |              No tests
  pointer                                      |   20            20
  code generation                              |   93            93
  code generation (relying on a device)        |    8             8
  execution                                    |   69      7     76
    @cuda                                      |   13            13
    argument passing                           |   28            28
    exceptions                                 |   17            17
    shmem divergence bug                       |    7             7
    dynamic parallelism                        |    4      7     11
      basic usage                              |           1      1
      anonymous functions                      |           1      1
      closures                                 |           1      1
      argument passing                         |    4      1      5
      self-recursion                           |           1      1
      deep recursion                           |           1      1
      streams                                  |           1      1
  pointer                                      |   41            41
  device arrays                                |   20            20
  CUDA functionality                           |  253      1    254
    indexing                                   |    1             1
    math                                       |   71            71
    formatted output                           |    6             6
    @cuprint                                   |   27            27
    assertion                                  |              No tests
    shared memory                              |   14            14
    data movement and conversion               |    6             6
    clock and nanosleep                        |              No tests
    parallel synchronization and communication |   16            16
    libcudadevrt                               |           1      1
    atomics (low-level)                        |   50            50
    atomics (high-level)                       |   62            62
  NVTX                                         |              No tests
  examples                                     |    8             8
ERROR: LoadError: Some tests did not pass: 519 passed, 0 failed, 8 errored, 0 broken.
in expression starting at /users/omlins/.julia/packages/CUDAnative/nQ8Yi/test/runtests.jl:8
ERROR: Package CUDAnative errored during testing

(bin) pkg> 
(bin) pkg> test CuArrays
   Testing CuArrays
 Resolving package versions...
 Installed NaNMath ────────────── v0.3.3
 Installed MKL_jll ────────────── v2019.0.117+2
 Installed IntelOpenMP_jll ────── v2018.0.3+0
 Installed DiffResults ────────── v1.0.2
 Installed FillArrays ─────────── v0.8.5
 Installed FFTW ───────────────── v1.2.0
 Installed DiffRules ──────────── v1.0.1
 Installed ForwardDiff ────────── v0.10.9
 Installed FFTW_jll ───────────── v3.3.9+4
 Installed SpecialFunctions ───── v0.10.0
 Installed OpenSpecFun_jll ────── v0.5.3+1
 Installed CommonSubexpressions ─ v0.2.0
 Installed StaticArrays ───────── v0.12.1
  Building FFTW → `~/.julia/packages/FFTW/qqcBj/deps/build.log`
    Status `/tmp/jl_3bJiBU/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [bbf7d656] CommonSubexpressions v0.2.0
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.9
  [163ba53b] DiffResults v1.0.2
  [b552c78f] DiffRules v1.0.1
  [7a1cc6ca] FFTW v1.2.0
  [f5851436] FFTW_jll v3.3.9+4
  [1a297f60] FillArrays v0.8.5
  [f6369f11] ForwardDiff v0.10.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [1d5cc7b8] IntelOpenMP_jll v2018.0.3+0
  [929cbde3] LLVM v1.3.3
  [856f044c] MKL_jll v2019.0.117+2
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [77ba4419] NaNMath v0.3.3
  [efe28fd5] OpenSpecFun_jll v0.5.3+1
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [276daf66] SpecialFunctions v0.10.0
  [90137ffa] StaticArrays v0.12.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
┌ Debug: CUDA toolkit identified as 10.2.89
└ @ CUDAapi ~/.julia/packages/CUDAapi/wYUAO/src/discovery.jl:297
┌ Debug: Using CUDA 10.2.89 from an artifact at /users/omlins/.julia/artifacts/93956fcdec9ac5ea76289d25066f02c2f4ebe56e
└ @ CUDAnative ~/.julia/packages/CUDAnative/nQ8Yi/src/bindeps.jl:174
┌ Debug: Toolchain with LLVM 6.0.1, CUDA driver 10.2.0 and toolkit 10.2.89 supports devices 3.0, 3.2, 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2 and 7.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0 and 6.0
└ @ CUDAnative ~/.julia/packages/CUDAnative/nQ8Yi/src/bindeps.jl:197
┌ Debug: Using CUDA 10.2.0 from an artifact at /users/omlins/.julia/artifacts/93956fcdec9ac5ea76289d25066f02c2f4ebe56e
└ @ CuArrays ~/.julia/packages/CuArrays/g3teL/src/bindeps.jl:76
┌ Debug: Using CUDNN from an artifact at /users/omlins/.julia/artifacts/583aee6a50385a6636638b2d170626ad74b44317
└ @ CuArrays ~/.julia/packages/CuArrays/g3teL/src/bindeps.jl:120
┌ Debug: Using CUTENSOR from an artifact at /users/omlins/.julia/artifacts/2efa5337c181b4a3883d8dcbd4e1bc3642dbad8b
└ @ CuArrays ~/.julia/packages/CuArrays/g3teL/src/bindeps.jl:146
┌ Debug: Precompiling FFTW [7a1cc6ca-52ef-59f5-83cd-3a7055c09341]
└ @ Base loading.jl:1273
┌ Debug: Precompiling FFTW_jll [f5851436-0d7a-5f13-b9de-f02708fd171a]
└ @ Base loading.jl:1273
┌ Debug: Precompiling FillArrays [1a297f60-69ca-5386-bcde-b61e274b549b]
└ @ Base loading.jl:1273
┌ Debug: Initializing CUDA on thread 1
└ @ CUDAnative ~/.julia/packages/CUDAnative/nQ8Yi/src/init.jl:35
ERROR: LoadError: CUDA error: invalid device ordinal (code 101, ERROR_INVALID_DEVICE)
Stacktrace:
 [1] throw_api_error(::CUDAdrv.cudaError_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:131
 [2] macro expansion at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/error.jl:144 [inlined]
 [3] cuCtxCreate_v2 at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/libcuda.jl:108 [inlined]
 [4] CuContext(::CuDevice, ::CUDAdrv.CUctx_flags_enum) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/context.jl:73
 [5] CuContext at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/context.jl:72 [inlined]
 [6] CuContext(::var"#5#7", ::CuDevice) at /users/omlins/.julia/packages/CUDAdrv/b1mvw/src/context.jl:118
 [7] iterate at ./none:0 [inlined]
 [8] collect(::Base.Generator{CUDAdrv.DeviceSet,var"#4#6"}) at ./array.jl:622
 [9] top-level scope at /users/omlins/.julia/packages/CuArrays/g3teL/test/runtests.jl:22
 [10] include at ./boot.jl:328 [inlined]
 [11] include_relative(::Module, ::String) at ./loading.jl:1105
 [12] include(::Module, ::String) at ./Base.jl:31
 [13] include(::String) at ./client.jl:424
 [14] top-level scope at none:6
in expression starting at /users/omlins/.julia/packages/CuArrays/g3teL/test/runtests.jl:22
ERROR: Package CuArrays errored during testing

(bin) pkg> 

Interesting. These errors are not related to BinaryBuilder, though; they probably indicate that the tests are using some CUDA feature that is unsupported by your hardware. Please open an issue on CUDAnative.jl.

Here is what I’m getting:

(v1.3) pkg> add CUDAnative#master CuArrays#master GPUArrays#master
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
  Updating git-repo `https://github.com/JuliaGPU/CUDAnative.jl.git`
  Updating git-repo `https://github.com/JuliaGPU/CuArrays.jl.git`
  Updating git-repo `https://github.com/JuliaGPU/GPUArrays.jl.git`
 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
  [be33ccc6] ~ CUDAnative v2.10.2 ⇒ v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] ↓ CuArrays v1.7.2 ⇒ v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] ~ GPUArrays v2.0.1 ⇒ v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  Updating `~/.julia/environments/v1.3/Manifest.toml`
  [00ebfdb7] ↑ CSTParser v1.1.0 ⇒ v2.1.0
  [be33ccc6] ~ CUDAnative v2.10.2 ⇒ v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] ↓ CuArrays v1.7.2 ⇒ v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] ~ GPUArrays v2.0.1 ⇒ v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)

(v1.3) pkg> test CUDAnative CuArrays
   Testing CUDAnative
 Resolving package versions...
    Status `/tmp/jl_vbB12e/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.10
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [929cbde3] LLVM v1.3.3
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
[ Info: Testing using device GeForce GTX 1080 Ti (compute capability 6.1.0, 9.919 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
[ Info: Building the CUDAnative run-time library for your sm_61 device, this might take a while...
ERROR: LoadError: LoadError: UndefVarError: AbstractGPUArray not defined
Stacktrace:
 [1] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:1
 [6] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:425
 [13] top-level scope at ./none:3
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
ERROR: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/azamat/.julia/compiled/v1.3/CuArrays/7YFE0_W0wbc.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1283
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1105
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] exec_options(::Base.JLOptions) at ./client.jl:287
 [10] _start() at ./client.jl:460
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/pairwise.jl:3
example = pairwise.jl: Test Failed at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] (::var"#699#702"{String})() at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
 [2] cd(::var"#699#702"{String}, ::String) at ./file.jl:104
 [3] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:18
 [4] top-level scope at /home/azamat/julia-1.3.1/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
 [5] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:3
ERROR: LoadError: LoadError: UndefVarError: AbstractGPUArray not defined
Stacktrace:
 [1] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:1
 [6] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:425
 [13] top-level scope at ./none:3
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
ERROR: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/azamat/.julia/compiled/v1.3/CuArrays/7YFE0_W0wbc.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1283
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1105
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] exec_options(::Base.JLOptions) at ./client.jl:287
 [10] _start() at ./client.jl:460
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/peakflops.jl:1
example = peakflops.jl: Test Failed at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] (::var"#699#702"{String})() at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
 [2] cd(::var"#699#702"{String}, ::String) at ./file.jl:104
 [3] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:18
 [4] top-level scope at /home/azamat/julia-1.3.1/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
 [5] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:3
ERROR: LoadError: LoadError: UndefVarError: AbstractGPUArray not defined
Stacktrace:
 [1] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:1
 [6] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:425
 [13] top-level scope at ./none:3
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
ERROR: LoadError: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/azamat/.julia/compiled/v1.3/CuArrays/7YFE0_W0wbc.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1283
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1105
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] include(::String) at ./client.jl:424
 [10] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/reduce/verify.jl:3
 [11] include at ./boot.jl:328 [inlined]
 [12] include_relative(::Module, ::String) at ./loading.jl:1105
 [13] include(::Module, ::String) at ./Base.jl:31
 [14] exec_options(::Base.JLOptions) at ./client.jl:287
 [15] _start() at ./client.jl:460
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/reduce/reduce.jl:10
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/reduce/verify.jl:3
example = reduce/verify.jl: Test Failed at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] (::var"#699#702"{String})() at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
 [2] cd(::var"#699#702"{String}, ::String) at ./file.jl:104
 [3] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:18
 [4] top-level scope at /home/azamat/julia-1.3.1/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
 [5] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:3
ERROR: LoadError: LoadError: UndefVarError: AbstractGPUArray not defined
Stacktrace:
 [1] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:1
 [6] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:425
 [13] top-level scope at ./none:3
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
ERROR: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/azamat/.julia/compiled/v1.3/CuArrays/7YFE0_W0wbc.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1283
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1105
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] exec_options(::Base.JLOptions) at ./client.jl:287
 [10] _start() at ./client.jl:460
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/scan.jl:6
example = scan.jl: Test Failed at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] (::var"#699#702"{String})() at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
 [2] cd(::var"#699#702"{String}, ::String) at ./file.jl:104
 [3] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:18
 [4] top-level scope at /home/azamat/julia-1.3.1/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
 [5] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:3
ERROR: LoadError: LoadError: UndefVarError: AbstractGPUArray not defined
Stacktrace:
 [1] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
 [2] include at ./boot.jl:328 [inlined]
 [3] include_relative(::Module, ::String) at ./loading.jl:1105
 [4] include at ./Base.jl:31 [inlined]
 [5] include(::String) at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:1
 [6] top-level scope at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
 [7] include at ./boot.jl:328 [inlined]
 [8] include_relative(::Module, ::String) at ./loading.jl:1105
 [9] include(::Module, ::String) at ./Base.jl:31
 [10] top-level scope at none:2
 [11] eval at ./boot.jl:330 [inlined]
 [12] eval(::Expr) at ./client.jl:425
 [13] top-level scope at ./none:3
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/array.jl:1
in expression starting at /home/azamat/.julia/packages/CuArrays/ks5EI/src/CuArrays.jl:25
ERROR: LoadError: Failed to precompile CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae] to /home/azamat/.julia/compiled/v1.3/CuArrays/7YFE0_W0wbc.ji.
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1283
 [3] _require(::Base.PkgId) at ./loading.jl:1024
 [4] require(::Base.PkgId) at ./loading.jl:922
 [5] require(::Module, ::Symbol) at ./loading.jl:917
 [6] include at ./boot.jl:328 [inlined]
 [7] include_relative(::Module, ::String) at ./loading.jl:1105
 [8] include(::Module, ::String) at ./Base.jl:31
 [9] exec_options(::Base.JLOptions) at ./client.jl:287
 [10] _start() at ./client.jl:460
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/examples/vadd.jl:3
example = vadd.jl: Test Failed at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] (::var"#699#702"{String})() at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
 [2] cd(::var"#699#702"{String}, ::String) at ./file.jl:104
 [3] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:18
 [4] top-level scope at /home/azamat/julia-1.3.1/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
 [5] top-level scope at /home/azamat/.julia/packages/CUDAnative/sivOw/test/examples.jl:3
Test Summary:                           | Pass  Fail  Total
CUDAnative                              |  522     5    527
  base interface                        |             No tests
  pointer                               |   20           20
  code generation                       |   93           93
  code generation (relying on a device) |    8            8
  execution                             |   77           77
  pointer                               |   41           41
  device arrays                         |   20           20
  CUDA functionality                    |  253          253
  NVTX                                  |             No tests
  examples                              |    3     5      8
    example = hello_world.jl            |    1            1
    example = pairwise.jl               |          1      1
    example = peakflops.jl              |          1      1
    example = reduce/verify.jl          |          1      1
    example = scan.jl                   |          1      1
    example = vadd.jl                   |          1      1
    example = wmma/high-level.jl        |    1            1
    example = wmma/low-level.jl         |    1            1
ERROR: LoadError: Some tests did not pass: 522 passed, 5 failed, 0 errored, 0 broken.
in expression starting at /home/azamat/.julia/packages/CUDAnative/sivOw/test/runtests.jl:9
   Testing CuArrays
 Resolving package versions...
    Status `/tmp/jl_NFn4em/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [bbf7d656] CommonSubexpressions v0.2.0
  [e66e0078] CompilerSupportLibraries_jll v0.2.0+1
  [8f4d0f93] Conda v1.4.1
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.10
  [163ba53b] DiffResults v1.0.2
  [b552c78f] DiffRules v1.0.1
  [7a1cc6ca] FFTW v1.1.0
  [1a297f60] FillArrays v0.8.5
  [f6369f11] ForwardDiff v0.10.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [682c06a0] JSON v0.21.0
  [929cbde3] LLVM v1.3.3
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [77ba4419] NaNMath v0.3.3
  [efe28fd5] OpenSpecFun_jll v0.5.3+2
  [bac558e1] OrderedCollections v1.1.0
  [69de0a69] Parsers v0.3.11
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [276daf66] SpecialFunctions v0.10.0
  [90137ffa] StaticArrays v0.12.1
  [a759f4b9] TimerOutputs v0.5.3
  [81def892] VersionParsing v1.2.0
  [2a0f44e3] Base64  [`@stdlib/Base64`]
  [ade2ca70] Dates  [`@stdlib/Dates`]
  [8ba89e20] Distributed  [`@stdlib/Distributed`]
  [b77e0a4c] InteractiveUtils  [`@stdlib/InteractiveUtils`]
  [76f85450] LibGit2  [`@stdlib/LibGit2`]
  [8f399da3] Libdl  [`@stdlib/Libdl`]
  [37e2e46d] LinearAlgebra  [`@stdlib/LinearAlgebra`]
  [56ddb016] Logging  [`@stdlib/Logging`]
  [d6f4376e] Markdown  [`@stdlib/Markdown`]
  [a63ad114] Mmap  [`@stdlib/Mmap`]
  [44cfe95a] Pkg  [`@stdlib/Pkg`]
  [de0858da] Printf  [`@stdlib/Printf`]
  [3fa0cd96] REPL  [`@stdlib/REPL`]
  [9a3f8284] Random  [`@stdlib/Random`]
  [ea8e919c] SHA  [`@stdlib/SHA`]
  [9e88b42a] Serialization  [`@stdlib/Serialization`]
  [6462fe0b] Sockets  [`@stdlib/Sockets`]
  [2f01184e] SparseArrays  [`@stdlib/SparseArrays`]
  [10745b16] Statistics  [`@stdlib/Statistics`]
  [8dfed614] Test  [`@stdlib/Test`]
  [cf7118a7] UUIDs  [`@stdlib/UUIDs`]
  [4ec0a83e] Unicode  [`@stdlib/Unicode`]
[ Info: Testing using device GeForce GTX 1080 Ti (compute capability 6.1.0, 9.802 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
[ Info: Testing CUDNN 7.6.5
┌ Warning: Not testing CUTENSOR
└ @ Main ~/.julia/packages/CuArrays/ks5EI/test/tensor.jl:7
[ Info: Testing ForwardDiff integration
Test Summary: | Pass  Total
CuArrays      | 4612   4612
   Testing CuArrays tests passed 
ERROR: Package CUDAnative errored during testing
julia> versioninfo()
Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

I appreciate any opportunity to run CUDA code. :smiley:

[ Info: Testing using device GeForce RTX 2070 (compute capability 7.5.0, 7.128 GiB available memory) on CUDA driver 10.1.0 and toolkit 10.1.243
[ Info: Testing CUDNN 7.6.5
[ Info: Testing CUTENSOR 1.0.1
[ Info: Testing ForwardDiff integration
Test Summary: | Pass  Broken  Total
CuArrays      | 5862       1   5863
   Testing CuArrays tests passed 
ERROR: Package CUDAnative errored during testing

Missing something in the output?

[ Info: Testing using device GeForce RTX 2070 (compute capability 7.5.0, 7.128 GiB available memory) on CUDA driver 10.1.0 and toolkit 10.1.243
basic reflection: Error During Test at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:109
  Test threw exception
  Expression: CUDAnative.code_sass(devnull, valid_kernel, Tuple{}) == nothing
  CUPTIError: user doesn't have sufficient privileges which are required to start the profiling session (code 35, CUPTI_ERROR_INSUFFICIENT_PRIVILEGES)
  Stacktrace:
   [1] throw_api_error(::CUDAnative.CUPTI.CUptiResult) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:117
   [2] macro expansion at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:130 [inlined]
   [3] cuptiSubscribe at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/libcupti.jl:197 [inlined]
   [4] #code_sass#233(::Bool, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::CUDAnative.CompilerJob) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:151
   [5] #code_sass at ./none:0 [inlined]
   [6] #code_sass#232(::VersionNumber, ::Bool, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:136
   [7] code_sass(::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:134
   [8] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:109
   [9] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [10] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:106
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [12] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:105
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [14] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:5
  
function name mangling: Error During Test at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:113
  Got exception outside of a @test
  CUPTIError: user doesn't have sufficient privileges which are required to start the profiling session (code 35, CUPTI_ERROR_INSUFFICIENT_PRIVILEGES)
  Stacktrace:
   [1] throw_api_error(::CUDAnative.CUPTI.CUptiResult) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:117
   [2] macro expansion at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:130 [inlined]
   [3] cuptiSubscribe at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/libcupti.jl:197 [inlined]
   [4] #code_sass#233(::Bool, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::CUDAnative.CompilerJob) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:151
   [5] #code_sass at ./none:0 [inlined]
   [6] #code_sass#232(::VersionNumber, ::Bool, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:136
   [7] code_sass(::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:134
   [8] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:121
   [9] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [10] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:114
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [12] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:105
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [14] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/codegen.jl:5
   [15] include at ./boot.jl:328 [inlined]
   [16] include_relative(::Module, ::String) at ./loading.jl:1105
   [17] include(::Module, ::String) at ./Base.jl:31
   [18] include(::String) at ./client.jl:424
   [19] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/runtests.jl:94
   [20] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [21] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/runtests.jl:11
   [22] include at ./boot.jl:328 [inlined]
   [23] include_relative(::Module, ::String) at ./loading.jl:1105
   [24] include(::Module, ::String) at ./Base.jl:31
   [25] include(::String) at ./client.jl:424
   [26] top-level scope at none:6
   [27] eval(::Module, ::Any) at ./boot.jl:330
   [28] exec_options(::Base.JLOptions) at ./client.jl:263
   [29] _start() at ./client.jl:460
  
reflection: Error During Test at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/execution.jl:51
  Got exception outside of a @test
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
   [1] throw_api_error(::CUDAnative.CUPTI.CUptiResult) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:117
   [2] macro expansion at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/error.jl:130 [inlined]
   [3] cuptiSubscribe at /home/oliver/.julia/packages/CUDAnative/sivOw/src/cupti/libcupti.jl:197 [inlined]
   [4] #code_sass#233(::Bool, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::CUDAnative.CompilerJob) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:151
   [5] #code_sass at ./none:0 [inlined]
   [6] #code_sass#232(::VersionNumber, ::Bool, ::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.code_sass), ::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:136
   [7] code_sass(::Base.DevNull, ::Any, ::Any) at /home/oliver/.julia/packages/CUDAnative/sivOw/src/reflection.jl:134
   [8] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/execution.jl:57
   [9] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [10] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/execution.jl:52
   [11] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [12] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/execution.jl:9
   [13] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [14] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/device/execution.jl:5
   [15] include at ./boot.jl:328 [inlined]
   [16] include_relative(::Module, ::String) at ./loading.jl:1105
   [17] include(::Module, ::String) at ./Base.jl:31
   [18] include(::String) at ./client.jl:424
   [19] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/runtests.jl:95
   [20] top-level scope at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.3/Test/src/Test.jl:1107
   [21] top-level scope at /home/oliver/.julia/packages/CUDAnative/sivOw/test/runtests.jl:11
   [22] include at ./boot.jl:328 [inlined]
   [23] include_relative(::Module, ::String) at ./loading.jl:1105
   [24] include(::Module, ::String) at ./Base.jl:31
   [25] include(::String) at ./client.jl:424
   [26] top-level scope at none:6
   [27] eval(::Module, ::Any) at ./boot.jl:330
   [28] exec_options(::Base.JLOptions) at ./client.jl:263
   [29] _start() at ./client.jl:460
  
Test Summary:                           | Pass  Error  Total
CUDAnative                              |  516      3    519
  base interface                        |              No tests
  pointer                               |   20            20
  code generation                       |   93            93
  code generation (relying on a device) |    7      2      9
    LLVM                                |    5             5
    SASS                                |    2      2      4
      basic reflection                  |    1      1      2
      function name mangling            |    1      1      2
  execution                             |   67      1     68
    @cuda                               |    3      1      4
      low-level interface               |              No tests
      launch configuration              |              No tests
      compilation params                |    1             1
      reflection                        |           1      1
      shared memory                     |              No tests
      streams                           |              No tests
      external kernels                  |              No tests
      calling device function           |              No tests
    argument passing                    |   28            28
    exceptions                          |   17            17
    shmem divergence bug                |    7             7
    dynamic parallelism                 |   11            11
    cooperative groups                  |    1             1
  pointer                               |   41            41
  device arrays                         |   20            20
  CUDA functionality                    |  253           253
  NVTX                                  |              No tests
  examples                              |    8             8
ERROR: LoadError: Some tests did not pass: 516 passed, 0 failed, 3 errored, 0 broken.
in expression starting at /home/oliver/.julia/packages/CUDAnative/sivOw/test/runtests.jl:9

And someone who cares! :stuck_out_tongue:

Like others here, I'm having trouble with CUDAnative:

(@v1.4) pkg> test CUDAnative
    Testing CUDAnative
Status `/tmp/jl_FgGQaf/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.10
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [929cbde3] LLVM v1.3.3
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64 
  [ade2ca70] Dates 
  [8ba89e20] Distributed 
  [b77e0a4c] InteractiveUtils 
  [76f85450] LibGit2 
  [8f399da3] Libdl 
  [37e2e46d] LinearAlgebra 
  [56ddb016] Logging 
  [d6f4376e] Markdown 
  [44cfe95a] Pkg 
  [de0858da] Printf 
  [3fa0cd96] REPL 
  [9a3f8284] Random 
  [ea8e919c] SHA 
  [9e88b42a] Serialization 
  [6462fe0b] Sockets 
  [2f01184e] SparseArrays 
  [10745b16] Statistics 
  [8dfed614] Test 
  [cf7118a7] UUIDs 
  [4ec0a83e] Unicode 
[ Info: Testing using device GeForce RTX 2060 (compute capability 7.5.0, 4.613 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = hello_world.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = pairwise.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = peakflops.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = reduce/verify.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = scan.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = vadd.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = wmma/high-level.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
ERROR: LoadError: ArgumentError: Package BenchmarkTools not found in current path:
- Run `import Pkg; Pkg.add("BenchmarkTools")` to install the BenchmarkTools package.

Stacktrace:
 [1] require(::Module, ::Symbol) at ./loading.jl:892
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] include_ifexists at ./client.jl:212 [inlined]
 [4] load_julia_startup() at ./client.jl:320
 [5] exec_options(::Base.JLOptions) at ./client.jl:259
 [6] _start() at ./client.jl:484
in expression starting at /home/mason/.julia/config/startup.jl:1
example = wmma/low-level.jl: Test Failed at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34
  Expression: rv
Stacktrace:
 [1] macro expansion at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:34 [inlined]
 [2] macro expansion at /home/mason/julia/usr/share/julia/stdlib/v1.4/Test/src/Test.jl:1186 [inlined]
 [3] (::var"#741#744"{String})() at /home/mason/.julia/packages/CUDAnative/sivOw/test/examples.jl:20
Test Summary:                           | Pass  Fail  Total
CUDAnative                              |  725     8    733
  base interface                        |             No tests
  pointer                               |   20           20
  code generation                       |   93           93
  code generation (relying on a device) |    8            8
  execution                             |   77           77
  pointer                               |   41           41
  device arrays                         |   20           20
  CUDA functionality                    |  253          253
  WMMA                                  |  206          206
  NVTX                                  |             No tests
  examples                              |          8      8
    example = hello_world.jl            |          1      1
    example = pairwise.jl               |          1      1
    example = peakflops.jl              |          1      1
    example = reduce/verify.jl          |          1      1
    example = scan.jl                   |          1      1
    example = vadd.jl                   |          1      1
    example = wmma/high-level.jl        |          1      1
    example = wmma/low-level.jl         |          1      1
ERROR: LoadError: Some tests did not pass: 725 passed, 8 failed, 0 errored, 0 broken.
in expression starting at /home/mason/.julia/packages/CUDAnative/sivOw/test/runtests.jl:9
ERROR: Package CUDAnative errored during testing

but CuArrays passes beautifully

    Testing CuArrays
Status `/tmp/jl_ucCRlc/Manifest.toml`
  [621f4979] AbstractFFTs v0.5.0
  [79e6a3ab] Adapt v1.0.1
  [b99e7846] BinaryProvider v0.5.8
  [fa961155] CEnum v0.2.0
  [3895d2a7] CUDAapi v3.1.0
  [c5f51814] CUDAdrv v6.0.0
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)
  [bbf7d656] CommonSubexpressions v0.2.0
  [e66e0078] CompilerSupportLibraries_jll v0.2.0+1
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [864edb3b] DataStructures v0.17.10
  [163ba53b] DiffResults v1.0.2
  [b552c78f] DiffRules v1.0.1
  [7a1cc6ca] FFTW v1.2.0
  [f5851436] FFTW_jll v3.3.9+4
  [1a297f60] FillArrays v0.8.5
  [f6369f11] ForwardDiff v0.10.9
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
  [1d5cc7b8] IntelOpenMP_jll v2018.0.3+0
  [929cbde3] LLVM v1.3.3
  [856f044c] MKL_jll v2019.0.117+2
  [1914dd2f] MacroTools v0.5.4
  [872c559c] NNlib v0.6.4
  [77ba4419] NaNMath v0.3.3
  [efe28fd5] OpenSpecFun_jll v0.5.3+2
  [bac558e1] OrderedCollections v1.1.0
  [189a3867] Reexport v0.2.0
  [ae029012] Requires v1.0.1
  [276daf66] SpecialFunctions v0.10.0
  [90137ffa] StaticArrays v0.12.1
  [a759f4b9] TimerOutputs v0.5.3
  [2a0f44e3] Base64 
  [ade2ca70] Dates 
  [8ba89e20] Distributed 
  [b77e0a4c] InteractiveUtils 
  [76f85450] LibGit2 
  [8f399da3] Libdl 
  [37e2e46d] LinearAlgebra 
  [56ddb016] Logging 
  [d6f4376e] Markdown 
  [44cfe95a] Pkg 
  [de0858da] Printf 
  [3fa0cd96] REPL 
  [9a3f8284] Random 
  [ea8e919c] SHA 
  [9e88b42a] Serialization 
  [6462fe0b] Sockets 
  [2f01184e] SparseArrays 
  [10745b16] Statistics 
  [8dfed614] Test 
  [cf7118a7] UUIDs 
  [4ec0a83e] Unicode 
[ Info: Testing using device GeForce RTX 2060 (compute capability 7.5.0, 4.619 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
[ Info: Testing CUDNN 7.6.5
[ Info: Testing CUTENSOR 1.0.1
[ Info: Testing ForwardDiff integration
Test Summary: | Pass  Broken  Total
CuArrays      | 5862       1   5863
    Testing CuArrays tests passed 

Using the change I made in this PR, I see all tests pass:

(@v1.4) pkg> test CUDAnative
    Testing CUDAnative
Status `/tmp/jl_DrSkkr/Manifest.toml`
 ...
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)
  [0c68f7d7] GPUArrays v2.0.1 #master (https://github.com/JuliaGPU/GPUArrays.jl.git)
...
[ Info: Testing using device GeForce RTX 2060 (compute capability 7.5.0, 4.587 GiB available memory) on CUDA driver 10.2.0 and toolkit 10.2.89
[ Info: Building the CUDAnative run-time library for your sm_75 device, this might take a while...
Test Summary: | Pass  Total
CUDAnative    |  733    733
    Testing CUDAnative tests passed 
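
Every failure in the first run above has the same stack trace: the example is started as a separate Julia process, that process loads `~/.julia/config/startup.jl`, and the `BenchmarkTools` import there fails inside the temporary test environment. The PR itself is not reproduced in this thread, so the following is only a sketch of the general idea, assuming the child process is launched from a shell. `run_without_startup` and the `JULIA` override are hypothetical names; `--startup-file=no` is a standard `julia` command-line flag.

```shell
#!/bin/sh
# Hypothetical helper: start Julia without reading ~/.julia/config/startup.jl,
# so packages imported there (BenchmarkTools in the log above) are not required.
# JULIA is an assumed override for the executable; it defaults to `julia`.
run_without_startup() {
    "${JULIA:-julia}" --startup-file=no "$@"
}

# Example (not run here): run_without_startup path/to/script.jl
```

An alternative that needs no flag is to make the startup file itself tolerant of missing packages, at the cost of hiding a genuinely broken setup.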

That error occurs because you don’t have permission to profile on your system; you need to set the NVIDIA kernel module option NVreg_RestrictProfilingToAdminUsers=0. Our error message for that needs to improve, though :slight_smile:
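
To check whether this restriction is active before changing anything, something like the following may help. It is a minimal sketch, assuming the driver exposes its parameters in `/proc/driver/nvidia/params` with a `RmProfilingAdminOnly` entry; `profiling_restricted` is a hypothetical helper name, and the modprobe file name in the comment is only an example.

```shell
#!/bin/sh
# Report whether the NVIDIA driver restricts GPU profiling to admin users.
# Assumption: the driver lists its parameters in /proc/driver/nvidia/params,
# including a line of the form "RmProfilingAdminOnly: 1".
profiling_restricted() {
    params_file="${1:-/proc/driver/nvidia/params}"
    if [ ! -r "$params_file" ]; then
        echo "unknown"
        return 0
    fi
    # Split each line on ": " and print the value of the matching key.
    value=$(awk -F': *' '$1 == "RmProfilingAdminOnly" { print $2 }' "$params_file")
    case "$value" in
        1) echo "restricted" ;;
        0) echo "unrestricted" ;;
        *) echo "unknown" ;;
    esac
}

# To lift the restriction, the option is typically set in a modprobe
# configuration file (example name: /etc/modprobe.d/nvidia-profiling.conf):
#   options nvidia NVreg_RestrictProfilingToAdminUsers=0
# followed by reloading the nvidia module or rebooting.
```

Calling `profiling_restricted` with no argument reads the default path and prints `restricted`, `unrestricted`, or `unknown`.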

Is this thread an okay place to report performance regressions with the BinaryBuilder version? Maybe this should just be an issue instead?

Here’s an example for me on the master branch with CUDA from BinaryBuilder:

julia> using BenchmarkTools, CuArrays

julia> function pi_mc_cu(nsamples)
           xs = CuArrays.rand(nsamples); ys = CuArrays.rand(nsamples)
           mapreduce((x, y) -> (x^2 + y^2) < 1.0, +, xs, ys, init=0) * 4/nsamples
       end
pi_mc_cu (generic function with 1 method)

julia> @benchmark pi_mc_cu(10000000)
BenchmarkTools.Trial: 
  memory estimate:  16.63 KiB
  allocs estimate:  473
  --------------
  minimum time:     1.620 ms (0.00% GC)
  median time:      1.666 ms (0.00% GC)
  mean time:        1.709 ms (1.60% GC)
  maximum time:     9.460 ms (7.77% GC)
  --------------
  samples:          2921
  evals/sample:     1

(@v1.4) pkg> st CuArrays 
Status `~/.julia/environments/v1.4/Project.toml`
  [3a865a2d] CuArrays v1.7.0 #master (https://github.com/JuliaGPU/CuArrays.jl.git)

(@v1.4) pkg> st CUDAnative
Status `~/.julia/environments/v1.4/Project.toml`
  [be33ccc6] CUDAnative v2.10.2 #master (https://github.com/JuliaGPU/CUDAnative.jl.git)

and here’s that same example on the latest tagged version:

julia> using BenchmarkTools, CuArrays
[ Info: Precompiling CuArrays [3a865a2d-5b23-5a0f-bc46-62713ec82fae]
┌ Warning: Incompatibility detected between CUDA and LLVM 8.0+; disabling debug info emission for CUDA kernels
└ @ CUDAnative ~/.julia/packages/CUDAnative/hfulr/src/CUDAnative.jl:114
┌ Warning: You are using CUDNN 7.6.5 for CUDA 10.1.0 with CUDA toolkit 10.2.89; these might be incompatible.
└ @ CuArrays ~/.julia/packages/CuArrays/HE8G6/src/CuArrays.jl:127

julia> function pi_mc_cu(nsamples)
           xs = CuArrays.rand(nsamples); ys = CuArrays.rand(nsamples)
           mapreduce((x, y) -> (x^2 + y^2) < 1.0, +, xs, ys, init=0) * 4/nsamples
       end
pi_mc_cu (generic function with 1 method)

julia> @benchmark pi_mc_cu(10000000)
BenchmarkTools.Trial: 
  memory estimate:  4.61 KiB
  allocs estimate:  126
  --------------
  minimum time:     594.302 μs (0.00% GC)
  median time:      659.321 μs (0.00% GC)
  mean time:        667.914 μs (1.58% GC)
  maximum time:     2.338 ms (39.61% GC)
  --------------
  samples:          7463
  evals/sample:     1

(@v1.4) pkg> st CuArrays
Status `~/.julia/environments/v1.4/Project.toml`
  [3a865a2d] CuArrays v1.7.2

(@v1.4) pkg> st CUDAnative
Status `~/.julia/environments/v1.4/Project.toml`
  [be33ccc6] CUDAnative v2.10.2

As you can see, the new master is about 2.5x slower for me (median 1.666 ms vs. 659.321 μs).


julia> versioninfo()
Julia Version 1.4.0-rc1.0
Commit b0c33b0cf5* (2020-01-23 17:23 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 5 2600 Six-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, znver1)
Environment:
  JULIA_NUM_THREADS = 6

[mason@mason-pc ~]$ sudo pacman -Q --info cuda
Name            : cuda
Version         : 10.2.89-3
Description     : NVIDIA's GPU programming toolkit
Architecture    : x86_64
URL             : https://developer.nvidia.com/cuda-zone
Licenses        : custom:NVIDIA
Groups          : None
Provides        : cuda-toolkit  cuda-sdk
Depends On      : gcc8-libs  gcc8  opencl-nvidia  nvidia-utils
Optional Deps   : gdb: for cuda-gdb
                  java-runtime=8: for nsight and nvvp
Required By     : cudnn
Optional For    : None
Conflicts With  : None
Replaces        : cuda-toolkit  cuda-sdk
Installed Size  : 4.04 GiB
Packager        : Sven-Hendrik Haase <svenstaro@gmail.com>
Build Date      : Tue 31 Dec 2019 01:07:53 AM MST
Install Date    : Wed 26 Feb 2020 03:04:42 PM MST
Install Reason  : Explicitly installed
Install Script  : Yes
Validated By    : Signature

[mason@mason-pc ~]$ lspci  -v -s  $(lspci | grep ' VGA ' | cut -d" " -f 1)
1f:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 Rev. A] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: ZOTAC International (MCO) Ltd. TU106 [GeForce RTX 2060 Rev. A]
        Flags: bus master, fast devsel, latency 0, IRQ 71
        Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Memory at f0000000 (64-bit, prefetchable) [size=32M]
        I/O ports at e000 [size=128]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia

Please open an issue for that. mapreduce got reworked, with significant speed-ups where I tested it, but apparently some slowdowns too.

Ah, of course, I should have tested without the artifacts. You’re right, the artifacts are not what causes the problem:

[mason@mason-pc ~]$ JULIA_CUDA_USE_BINARYBUILDER=false julia -q
julia> using CuArrays
┌ Warning: You are using CUDNN 7.6.5 for CUDA 10.1.0 with CUDA toolkit 10.2.89; these might be incompatible.
└ @ CuArrays ~/.julia/packages/CuArrays/ks5EI/src/bindeps.jl:234

julia> function pi_mc_cu(nsamples)
                  xs = CuArrays.rand(nsamples); ys = CuArrays.rand(nsamples)
                  mapreduce((x, y) -> (x^2 + y^2) < 1.0, +, xs, ys, init=0) * 4/nsamples
              end
pi_mc_cu (generic function with 1 method)

julia> @benchmark pi_mc_cu(10000000)
BenchmarkTools.Trial: 
  memory estimate:  16.63 KiB
  allocs estimate:  473
  --------------
  minimum time:     1.630 ms (0.00% GC)
  median time:      1.673 ms (0.00% GC)
  mean time:        1.770 ms (1.84% GC)
  maximum time:     3.905 ms (32.20% GC)
  --------------
  samples:          2818
  evals/sample:     1

Opening an issue now.
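
For anyone who wants to repeat this kind of check, the comparison can be scripted so that both runs use the same code. This is only a sketch: `compare_cuda_sources`, the `JULIA` override, and the `bench.jl` script name are hypothetical, and it assumes the benchmark has been saved to a file instead of being typed at the REPL. The only thing taken from this thread is the `JULIA_CUDA_USE_BINARYBUILDER=false` switch.

```shell
#!/bin/sh
# Hypothetical helper: run one Julia script twice, first with the default
# (CUDA from BinaryBuilder artifacts), then with a local CUDA installation.
# JULIA is an assumed override for the executable; it defaults to `julia`.
compare_cuda_sources() {
    script="$1"
    echo "== default: CUDA from BinaryBuilder artifacts =="
    # Subshell so that unsetting the variable does not leak to the caller.
    ( unset JULIA_CUDA_USE_BINARYBUILDER; "${JULIA:-julia}" -q "$script" )
    echo "== JULIA_CUDA_USE_BINARYBUILDER=false: local CUDA installation =="
    JULIA_CUDA_USE_BINARYBUILDER=false "${JULIA:-julia}" -q "$script"
}

# Example (not run here): compare_cuda_sources bench.jl
```

If both runs show the same timings, as they did here, the difference lies in the package code and not in where the CUDA libraries come from.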