CUDA.jl tests fail at random

Hi, I am new to GPU computing in Julia. I recently installed CUDA.jl, with the following status:

(julia) pkg> st
Status `~/julia/Project.toml`
  [6e4b80f9] BenchmarkTools v1.3.2
  [052768ef] CUDA v3.12.0
  [7a1cc6ca] FFTW v1.5.0
  [f67ccb44] HDF5 v0.16.13
  [033835bb] JLD2 v0.4.29
  [91a5bcdd] Plots v1.38.0

When I ran ]test CUDA, I got the output below, including several failures with different errors. Following the suggestion in the test message, I tried --threads n with several values of n, but every time some workers terminated, seemingly at random ("Worker # terminated"). Can someone help me with these errors? I have tried several times and have not been able to finish the test suite even once. Part of the screen output is shown below, with all the relevant version info. Thanks in advance.

...
  [13072b0f] AxisAlgorithms v1.0.1
⌅ [ab4f0b2a] BFloat16s v0.2.0
  [fa961155] CEnum v0.4.2
...
  [46192b85] GPUArraysCore v0.1.2
⌅ [61eb1bfa] GPUCompiler v0.16.7
  [a98d9a8b] Interpolations v0.14.7
...
        Info Packages marked with ⌅ have new versions available but cannot be upgraded.
     Testing Running tests...
┌ Info: System information:
│ CUDA toolkit 11.7, artifact installation
│ NVIDIA driver 527.56.0, for CUDA 12.0
│ CUDA driver 12.0
│
│ Libraries:
│ - CUBLAS: 11.10.1
│ - CURAND: 10.2.10
│ - CUFFT: 10.7.2
│ - CUSOLVER: 11.3.5
│ - CUSPARSE: 11.7.3
│ - CUPTI: 17.0.0
│ - NVML: 12.0.0+525.65
│ - CUDNN: 8.30.2 (for CUDA 11.5.0)
│ - CUTENSOR: 1.4.0 (for CUDA 11.5.0)
│
│ Toolchain:
│ - Julia: 1.8.1
│ - LLVM: 13.0.1
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
│ - Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
│
│ 1 device:
└   0: NVIDIA GeForce RTX 3060 Laptop GPU (sm_86, 4.956 GiB / 6.000 GiB available)
[ Info: Testing using 1 device(s): 0. NVIDIA GeForce RTX 3060 Laptop GPU (UUID f780bc3c-a9f0-9c3d-e723-c01de3450fd7)
...
                                                  |          | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test                                     (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
initialization                                (2) |     6.50 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  0.3 |     122.10 |   891.5
...
Worker 5 terminated.
gpuarrays/linalg/mul!/matrix-matrix           (5) |         failed at 2023-01-01T18:51:32.221
Unhandled Task ERROR: EOFError: read end of file
...
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
...
Worker 4 terminated.
gpuarrays/linalg/norm                         (4) |         failed at 2023-01-01T18:57:34.022
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
gpuarrays/base                                (3) |   269.04 |   0.07 |  0.0 |       8.90 |      N/A |   2.51 |  0.9 |    5865.70 |  2267.16 |
...
broadcast                                     (7) |    23.83 |   0.00 |  0.0 |       0.00 |      N/A |   1.47 |  6.2 |    3407.76 |  1668.28 |
codegen                                       (7) |         failed at 2023-01-01T19:06:18.211
array                                         (8) |   137.78 |   0.14 |  0.1 |    1261.61 |      N/A |  11.54 |  8.4 |   24142.80 |  2255.11 |
      From worker 8:
      From worker 8:    signal (11): Segmentation fault
...
      From worker 8:    unknown function (ip: 0x7f86dc6406e5)
      From worker 8:    unknown function (ip: 0x7f8669b71768)
      From worker 8:    unknown function (ip: 0x7f8668f62343)
...
Worker 8 terminated.
cudadrv                                       (8) |         failed at 2023-01-01T19:06:36.235
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 3 terminated.
gpuarrays/broadcasting                        (3) |         failed at 2023-01-01T19:16:14.934
Unhandled Task ERROR: EOFError: read end of file
...

The segmentation fault on worker 8 is the crucial part, but you left out the interesting part of its stack trace. Could you post the remaining lines? Do they point to a specific location?

Your driver (for CUDA 12.0) is very new, and CUDA.jl may not be fully compatible with it yet. Try the master branch; I fixed at least one segfault there: Change invalid code test for CUDA 12 compatibility. · JuliaGPU/CUDA.jl@c40daf8 · GitHub. If that is the error you ran into, note that it doesn’t affect any realistic use case, so you should be fine just using CUDA.jl in your environment.
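
To capture the complete stack trace for one crashing test set, it may help to rerun only that set in a single worker process. The sketch below just builds the argument list. It assumes that the CUDA.jl test runner accepts a --jobs=N option and test names as positional arguments; cuda_test_args is a hypothetical helper name, not part of CUDA.jl.

```julia
# Hypothetical helper: build an argument list for CUDA.jl's test runner.
# Assumption: the runner accepts `--jobs=N` plus test names as positional arguments.
function cuda_test_args(tests::Vector{String}; jobs::Int=1)
    jobs >= 1 || throw(ArgumentError("jobs must be at least 1"))
    return vcat(["--jobs=$jobs"], tests)
end

# Two of the test sets that failed in the log above.
args = cuda_test_args(["gpuarrays/linalg/norm", "codegen"])
println(args)

# On a machine with a working GPU, the arguments could then be passed on:
# using Pkg
# Pkg.test("CUDA"; test_args=args)
```

With a single job, the output of the selected test sets is not interleaved with output from other workers, which should make it easier to copy the full trace.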

The output is too long to copy by hand, so I re-tested with the master branch as suggested and redirected the output to out.log as follows:

julia --threads 4 test.jl &> out.log

where the contents of test.jl are as follows:

using Pkg
Pkg.activate(".")
Pkg.status()
Pkg.test("CUDA")

Unfortunately, there are still plenty of errors, and the color coding of the screen output is lost. Half of the output is attached. Apologies for the length.
Are these errors not important? If they are, is there a way to “retrofit” the CUDA 12.0 toolkit back to a compatible version? So far, my own experiments with the CUDA package seem to work. I’m using WSL2 with Ubuntu 20.04.5 LTS.
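
Since the log is long, a short script can pull out just the test sets that failed. This is only a sketch: failed_tests is a hypothetical helper, and it assumes that failing rows of the results table look like "name (worker) | failed at <timestamp>", as in the output below. The sample rows are abbreviated copies of rows from that output.

```julia
# Hypothetical helper: collect the test sets whose table row says "failed at".
# Assumes rows of the form "name (worker) | failed at <timestamp>".
function failed_tests(lines)
    pattern = r"^(.+?)\s+\((\d+)\)\s+\|\s+failed at (\S+)"
    failures = NamedTuple{(:name, :worker, :time), Tuple{String, Int, String}}[]
    for line in lines
        m = match(pattern, line)
        m === nothing && continue  # skip rows of passing tests and other output
        push!(failures, (name = String(strip(m[1])),
                         worker = parse(Int, m[2]),
                         time = String(m[3])))
    end
    return failures
end

# Abbreviated sample rows; passing rows are cut off after the first columns.
sample = [
    "gpuarrays/math/intrinsics                     (5) |     4.75 |   0.00 |",
    "gpuarrays/statistics                          (5) |         failed at 2023-01-04T02:03:35.980",
    "cudadrv                                      (10) |    20.93 |   0.09 |",
    "codegen                                       (8) |         failed at 2023-01-04T02:16:47.645",
]

for f in failed_tests(sample)
    println(f.name, " (worker ", f.worker, ") failed at ", f.time)
end

# For the real log file: failed_tests(eachline("out.log"))
```

The worker number in each row can then be matched against the "Worker N terminated" and ProcessExitedException(N) messages to see which failures were crashes rather than failed assertions.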

    Activating project at `~/julia`
     Testing CUDA
      Status `/tmp/jl_bUo2zf/Project.toml`
  [79e6a3ab] Adapt v3.4.0
  [ab4f0b2a] BFloat16s v0.4.2
  [052768ef] CUDA v4.0.0 `https://github.com/JuliaGPU/CUDA.jl.git#master`
  [864edb3b] DataStructures v0.18.13
  [7a1cc6ca] FFTW v1.5.0
  [0c68f7d7] GPUArrays v8.5.0
  [a98d9a8b] Interpolations v0.14.7
  [872c559c] NNlib v0.8.13
  [276daf66] SpecialFunctions v2.1.7
  [a759f4b9] TimerOutputs v0.5.22
  [76a88914] CUDA_Runtime_jll v0.2.3+2
  [ade2ca70] Dates `@stdlib/Dates`
  [8ba89e20] Distributed `@stdlib/Distributed`
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`
  [de0858da] Printf `@stdlib/Printf`
  [3fa0cd96] REPL `@stdlib/REPL`
  [9a3f8284] Random `@stdlib/Random`
  [2f01184e] SparseArrays `@stdlib/SparseArrays`
  [10745b16] Statistics `@stdlib/Statistics`
  [8dfed614] Test `@stdlib/Test`
      Status `/tmp/jl_bUo2zf/Manifest.toml`
  [621f4979] AbstractFFTs v1.2.1
  [79e6a3ab] Adapt v3.4.0
  [13072b0f] AxisAlgorithms v1.0.1
  [ab4f0b2a] BFloat16s v0.4.2
  [fa961155] CEnum v0.4.2
  [052768ef] CUDA v4.0.0 `https://github.com/JuliaGPU/CUDA.jl.git#master`
  [1af6417a] CUDA_Runtime_Discovery v0.1.1
  [d360d2e6] ChainRulesCore v1.15.6
  [9e997f8a] ChangesOfVariables v0.1.4
  [34da2185] Compat v4.5.0
  [864edb3b] DataStructures v0.18.13
  [ffbed154] DocStringExtensions v0.9.3
  [e2ba6199] ExprTools v0.1.8
  [7a1cc6ca] FFTW v1.5.0
  [0c68f7d7] GPUArrays v8.5.0
  [46192b85] GPUArraysCore v0.1.2
  [61eb1bfa] GPUCompiler v0.17.0
  [a98d9a8b] Interpolations v0.14.7
  [3587e190] InverseFunctions v0.1.8
  [92d709cd] IrrationalConstants v0.1.1
  [692b3bcd] JLLWrappers v1.4.1
  [929cbde3] LLVM v4.14.1
  [2ab3a3ac] LogExpFunctions v0.3.19
  [872c559c] NNlib v0.8.13
  [6fe1bfb0] OffsetArrays v1.12.8
  [bac558e1] OrderedCollections v1.4.1
  [21216c6a] Preferences v1.3.0
  [74087812] Random123 v1.6.0
  [e6cf234a] RandomNumbers v1.5.3
  [c84ed2f1] Ratios v0.4.3
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [276daf66] SpecialFunctions v2.1.7
  [90137ffa] StaticArrays v1.5.12
  [1e83bf80] StaticArraysCore v1.4.0
  [a759f4b9] TimerOutputs v0.5.22
  [efce3f68] WoodburyMatrices v0.5.5
  [4ee394cb] CUDA_Driver_jll v0.2.0+0
  [76a88914] CUDA_Runtime_jll v0.2.3+2
  [f5851436] FFTW_jll v3.3.10+0
  [1d5cc7b8] IntelOpenMP_jll v2018.0.3+2
  [dad2f222] LLVMExtra_jll v0.0.16+0
  [856f044c] MKL_jll v2022.2.0+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [0dad84c5] ArgTools v1.1.1 `@stdlib/ArgTools`
  [56f22d72] Artifacts `@stdlib/Artifacts`
  [2a0f44e3] Base64 `@stdlib/Base64`
  [ade2ca70] Dates `@stdlib/Dates`
  [8ba89e20] Distributed `@stdlib/Distributed`
  [f43a241f] Downloads v1.6.0 `@stdlib/Downloads`
  [7b1f6079] FileWatching `@stdlib/FileWatching`
  [b77e0a4c] InteractiveUtils `@stdlib/InteractiveUtils`
  [4af54fe1] LazyArtifacts `@stdlib/LazyArtifacts`
  [b27032c2] LibCURL v0.6.3 `@stdlib/LibCURL`
  [76f85450] LibGit2 `@stdlib/LibGit2`
  [8f399da3] Libdl `@stdlib/Libdl`
  [37e2e46d] LinearAlgebra `@stdlib/LinearAlgebra`
  [56ddb016] Logging `@stdlib/Logging`
  [d6f4376e] Markdown `@stdlib/Markdown`
  [a63ad114] Mmap `@stdlib/Mmap`
  [ca575930] NetworkOptions v1.2.0 `@stdlib/NetworkOptions`
  [44cfe95a] Pkg v1.8.0 `@stdlib/Pkg`
  [de0858da] Printf `@stdlib/Printf`
  [3fa0cd96] REPL `@stdlib/REPL`
  [9a3f8284] Random `@stdlib/Random`
  [ea8e919c] SHA v0.7.0 `@stdlib/SHA`
  [9e88b42a] Serialization `@stdlib/Serialization`
  [1a1011a3] SharedArrays `@stdlib/SharedArrays`
  [6462fe0b] Sockets `@stdlib/Sockets`
  [2f01184e] SparseArrays `@stdlib/SparseArrays`
  [10745b16] Statistics `@stdlib/Statistics`
  [fa267f1f] TOML v1.0.0 `@stdlib/TOML`
  [a4e569a6] Tar v1.10.0 `@stdlib/Tar`
  [8dfed614] Test `@stdlib/Test`
  [cf7118a7] UUIDs `@stdlib/UUIDs`
  [4ec0a83e] Unicode `@stdlib/Unicode`
  [e66e0078] CompilerSupportLibraries_jll v0.5.2+0 `@stdlib/CompilerSupportLibraries_jll`
  [deac9b47] LibCURL_jll v7.84.0+0 `@stdlib/LibCURL_jll`
  [29816b5a] LibSSH2_jll v1.10.2+0 `@stdlib/LibSSH2_jll`
  [c8ffd9c3] MbedTLS_jll v2.28.0+0 `@stdlib/MbedTLS_jll`
  [14a3606d] MozillaCACerts_jll v2022.2.1 `@stdlib/MozillaCACerts_jll`
  [4536629a] OpenBLAS_jll v0.3.20+0 `@stdlib/OpenBLAS_jll`
  [05823500] OpenLibm_jll v0.8.1+0 `@stdlib/OpenLibm_jll`
  [83775a58] Zlib_jll v1.2.12+3 `@stdlib/Zlib_jll`
  [8e850b90] libblastrampoline_jll v5.1.1+0 `@stdlib/libblastrampoline_jll`
  [8e850ede] nghttp2_jll v1.48.0+0 `@stdlib/nghttp2_jll`
  [3f19e933] p7zip_jll v17.4.0+0 `@stdlib/p7zip_jll`
     Testing Running tests...
                                                  |          | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test                                     (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
initialization                                (2) |     9.97 |   0.00 |  0.0 |       0.00 |      N/A |   0.01 |  0.1 |      97.03 |  1209.04 |
gpuarrays/indexing scalar                     (3) |    34.00 |   0.04 |  0.1 |       0.01 |      N/A |   1.44 |  4.2 |    4906.21 |  1212.18 |
gpuarrays/math/power                          (2) |    79.78 |   0.00 |  0.0 |       0.01 |      N/A |   7.10 |  8.9 |   14302.66 |  2061.98 |
gpuarrays/linalg/mul!/vector-matrix           (3) |    92.43 |   0.01 |  0.0 |       0.02 |      N/A |   5.47 |  5.9 |   14498.98 |  2192.13 |
gpuarrays/interface                           (3) |     5.51 |   0.00 |  0.0 |       0.00 |      N/A |   0.32 |  5.8 |     761.85 |  2192.13 |
gpuarrays/indexing multidimensional           (2) |    49.48 |   0.00 |  0.0 |       1.21 |      N/A |   2.77 |  5.6 |    7719.91 |  2061.98 |
gpuarrays/linalg                              (5) |   143.29 |   0.07 |  0.0 |      11.72 |      N/A |   7.63 |  5.3 |   22060.00 |  2457.75 |
gpuarrays/reductions/reducedim!               (4) |   145.92 |   0.04 |  0.0 |       1.03 |      N/A |  13.05 |  8.9 |   26430.49 |  1226.55 |
gpuarrays/uniformscaling                      (5) |    12.69 |   0.00 |  0.0 |       0.01 |      N/A |   0.56 |  4.4 |    1509.86 |  2457.75 |
gpuarrays/reductions/any all count            (3) |    26.58 |   0.00 |  0.0 |       0.00 |      N/A |   2.73 | 10.3 |    5366.74 |  2192.13 |
gpuarrays/math/intrinsics                     (5) |     4.75 |   0.00 |  0.0 |       0.00 |      N/A |   0.25 |  5.3 |     665.09 |  2457.75 |
gpuarrays/statistics                          (5) |         failed at 2023-01-04T02:03:35.980
gpuarrays/linalg/mul!/matrix-matrix           (4) |   809.35 |   1.01 |  0.1 |       0.12 |      N/A |  11.21 |  1.4 |   23473.99 |  1895.87 |
gpuarrays/constructors                        (4) |    46.32 |   0.01 |  0.0 |       0.08 |      N/A |   2.35 |  5.1 |    6033.94 |  2382.25 |
gpuarrays/random                              (4) |    32.03 |   0.00 |  0.0 |       0.03 |      N/A |   1.90 |  5.9 |    4588.74 |  2382.25 |
gpuarrays/linalg/norm                         (3) |         failed at 2023-01-04T02:08:00.870
gpuarrays/base                                (4) |   124.02 |   1.99 |  1.6 |       8.90 |      N/A |   3.17 |  2.6 |    6723.08 |  2382.25 |
gpuarrays/reductions/minimum maximum extrema  (2) |  1082.80 |   0.62 |  0.1 |       2.19 |      N/A |  34.26 |  3.2 |   61498.86 |  2202.61 |
gpuarrays/reductions/mapreduce                (6) |   337.42 |   0.23 |  0.1 |       1.81 |      N/A |  20.98 |  6.2 |   44044.26 |  1314.33 |
gpuarrays/reductions/== isequal               (7) |   115.35 |   0.09 |  0.1 |       1.07 |      N/A |   9.27 |  8.0 |   20251.40 |  1210.26 |
gpuarrays/reductions/reduce                   (6) |    40.15 |   0.01 |  0.0 |       1.21 |      N/A |   1.10 |  2.8 |    2833.10 |  1516.86 |
apiutils                                      (6) |     0.15 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |       0.80 |  1516.88 |
gpuarrays/reductions/mapreducedim!            (2) |         failed at 2023-01-04T02:15:43.180
broadcast                                     (8) |    37.63 |   0.04 |  0.1 |       0.00 |      N/A |   2.38 |  6.3 |    5834.69 |  1203.00 |
codegen                                       (8) |         failed at 2023-01-04T02:16:47.645
gpuarrays/broadcasting                        (4) |         failed at 2023-01-04T02:29:47.941
array                                         (6) |  1191.20 |   0.84 |  0.1 |    1264.82 |      N/A |  12.89 |  1.1 |   22975.35 |  3057.52 |
cudadrv                                      (10) |    20.93 |   0.09 |  0.4 |       0.00 |      N/A |   0.68 |  3.3 |    2345.37 |  1204.49 |
curand                                       (10) |     0.46 |   0.00 |  0.0 |       0.00 |      N/A |   0.01 |  2.0 |      35.80 |  1204.49 |
cufft                                         (6) |    40.04 |   0.01 |  0.0 |     205.31 |      N/A |   2.59 |  6.5 |    4279.44 |  3136.33 |
examples                                      (6) |         failed at 2023-01-04T02:31:34.058
cusparse                                     (10) |    97.95 |   0.12 |  0.1 |       9.47 |      N/A |   5.22 |  5.3 |   10453.82 |  1967.34 |
      From worker 10:	WARNING: Method definition #5122#kernel(Any) in module Main at /home/sunjin/.julia/packages/CUDA/ifpPv/test/execution.jl:315 overwritten at /home/sunjin/.julia/packages/CUDA/ifpPv/test/execution.jl:323.
cublas                                        (9) |         failed at 2023-01-04T02:32:57.914
execution                                    (10) |         failed at 2023-01-04T02:33:11.848
gpuarrays/reductions/sum prod                 (7) |  1378.23 |  48.74 |  3.5 |       3.24 |      N/A |  30.53 |  2.2 |   53438.22 |  2154.23 |
iterator                                     (12) |     4.25 |   0.04 |  1.0 |       1.93 |      N/A |   0.19 |  4.4 |     664.93 |  1201.90 |
nvtx                                         (12) |     0.22 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      22.61 |  1201.90 |
pointer                                      (12) |     0.29 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      13.19 |  1201.90 |
exceptions                                   (11) |    86.29 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      34.57 |  1202.88 |
pool                                         (12) |     1.43 |   0.00 |  0.0 |       0.00 |      N/A |   0.30 | 21.0 |     204.39 |  1201.90 |
nvml                                         (13) |     0.54 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      59.84 |  1198.10 |
linalg                                        (7) |    46.79 |   0.00 |  0.0 |       9.03 |      N/A |   3.55 |  7.6 |    7630.12 |  3462.21 |
random                                       (11) |    45.98 |   0.05 |  0.1 |     256.58 |      N/A |   2.70 |  5.9 |    6763.16 |  1202.88 |
utils                                        (11) |     0.86 |   0.00 |  0.0 |       0.00 |      N/A |   0.02 |  2.0 |      85.98 |  1202.88 |
threading                                     (7) |     5.63 |   0.00 |  0.1 |      10.94 |      N/A |   0.44 |  7.9 |     564.09 |  3462.27 |
cusolver/multigpu                             (7) |    18.82 |   0.00 |  0.0 |     545.89 |      N/A |   0.82 |  4.3 |    1914.40 |  3491.86 |
texture                                      (13) |    65.86 |   0.04 |  0.1 |       0.09 |      N/A |   4.19 |  6.4 |   10976.77 |  1198.10 |
cusolver/sparse                               (7) |         failed at 2023-01-04T02:34:38.172
cusparse/conversions                         (14) |    31.69 |   0.12 |  0.4 |       1.43 |      N/A |   1.90 |  6.0 |    4384.70 |  1204.48 |
cusparse/device                              (14) |     0.33 |   0.00 |  0.1 |       0.01 |      N/A |   0.01 |  2.9 |       7.49 |  1204.48 |
cusparse/broadcast                           (13) |    60.58 |   0.01 |  0.0 |       0.02 |      N/A |   3.78 |  6.2 |    9708.01 |  1237.14 |
sorting                                      (12) |   171.13 |   0.01 |  0.0 |     543.84 |      N/A |   8.18 |  4.8 |   28184.76 |  3912.35 |
cusparse/generic                             (14) |    55.49 |   0.05 |  0.1 |       6.52 |      N/A |   2.64 |  4.7 |    6108.44 |  1211.70 |
device/array                                 (14) |     8.73 |   0.00 |  0.0 |       0.00 |      N/A |   0.53 |  6.1 |    1371.40 |  1211.71 |
cusparse/interfaces                          (13) |    87.57 |   0.09 |  0.1 |      10.75 |      N/A |   3.85 |  4.4 |    9440.00 |  1420.38 |
device/intrinsics                            (14) |    39.17 |   0.00 |  0.0 |       0.00 |      N/A |   2.42 |  6.2 |    6385.59 |  1238.05 |
device/ldg                                   (13) |    10.24 |   0.00 |  0.0 |       0.00 |      N/A |   0.68 |  6.6 |    1569.36 |  1420.38 |
cusparse/linalg                              (12) |         failed at 2023-01-04T02:37:18.497
cusolver/dense                               (11) |   193.07 |   0.13 |  0.1 |    2519.16 |      N/A |  13.88 |  7.2 |   29851.59 |  2153.11 |
device/intrinsics/memory                     (11) |    24.21 |   0.00 |  0.0 |       0.02 |      N/A |   1.58 |  6.5 |    3829.37 |  2361.20 |
device/random                                (14) |    48.45 |   0.00 |  0.0 |       0.17 |      N/A |   2.81 |  5.8 |    7153.73 |  1238.09 |
device/intrinsics/output                     (11) |    25.76 |   0.00 |  0.0 |       0.00 |      N/A |   1.86 |  7.2 |    4099.07 |  2361.20 |
device/intrinsics/atomics                    (13) |    78.17 |   0.00 |  0.0 |       0.00 |      N/A |   5.07 |  6.5 |   12078.32 |  1508.60 |
device/intrinsics/math                       (15) |    70.00 |   0.10 |  0.1 |       0.00 |      N/A |   4.00 |  5.7 |   10741.26 |  1336.75 |
device/intrinsics/wmma                       (14) |    89.49 |   0.01 |  0.0 |       0.63 |      N/A |   5.57 |  6.2 |   14501.88 |  1808.08 |
Testing finished in 50 minutes, 55 seconds, 583 milliseconds
gpuarrays/statistics: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(5)
gpuarrays/linalg/norm: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(3)
gpuarrays/reductions/mapreducedim!: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(2)
Worker 8 failed running test codegen:
Some tests did not pass: 13 passed, 0 failed, 3 errored, 0 broken.
codegen: Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:170
  Test threw exception
  Expression: CUDA.code_sass(devnull, valid_kernel, Tuple{}) == nothing
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#valid_kernel#71", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#valid_kernel#71", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:464 [inlined]
    [7] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:170 [inlined]
    [8] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [9] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:167 [inlined]
   [10] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [11] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
codegen: Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:174
  Got exception outside of a @test
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(kernel_341), Tuple{Ptr{Int64}}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(kernel_341), Tuple{Ptr{Int64}}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:179 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:175 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
codegen: Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:182
  Got exception outside of a @test
  CUPTIError: CUPTI doesn't allow multiple callback subscribers. Only a single subscriber can be registered at a time. (code 39, CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#kernel#73", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#kernel#73", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:185 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:183 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
gpuarrays/broadcasting: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(4)
examples: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(6)
Worker 9 failed running test cublas:
Some tests did not pass: 2180 passed, 2 failed, 0 errored, 0 broken.
cublas: Test Failed at /home/sunjin/.julia/packages/CUDA/ifpPv/test/cublas.jl:1031
  Expression: C ≈ Array(dA)
   Evaluated: ComplexF32[0.22159632f0 + 0.2527002f0im 0.7149214f0 - 0.07908373f0im … 1.584436f0 - 1.0320874f0im 0.88353616f0 + 0.11684081f0im; 0.8638178f0 + 0.19264713f0im 0.8187425f0 - 0.2023008f0im … 1.1051666f0 - 1.1044672f0im 1.1477678f0 + 0.30134824f0im; … ; 0.36396125f0 + 0.2685322f0im 0.53985554f0 + 0.5504699f0im … 1.3564304f0 - 1.298666f0im 0.7817802f0 + 0.5078405f0im; 0.46989524f0 + 0.8126974f0im 1.1645075f0 - 0.007235131f0im … 0.8363035f0 - 1.4039177f0im 0.9743213f0 + 0.30272323f0im] ≈ ComplexF32[-0.101180695f0 + 0.32050708f0im 0.21519771f0 + 0.6863358f0im … -1.32294f0 + 1.3511004f0im 0.021058274f0 + 0.89097947f0im; 0.283161f0 + 0.83851886f0im 0.35609248f0 + 0.76450187f0im … -0.8393788f0 + 1.3178332f0im -0.12041329f0 + 1.1805433f0im; … ; -0.040902115f0 + 0.45044902f0im -0.43627942f0 + 0.6357054f0im … -1.0425543f0 + 1.5618956f0im -0.38096747f0 + 0.8508501f0im; -0.45127892f0 + 0.8231804f0im 0.2312188f0 + 1.1413448f0im … -0.5119798f0 + 1.5518585f0im -0.1485673f0 + 1.0093914f0im]
Stacktrace:
 [1] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail})
   @ Test /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:983
 [2] top-level scope
   @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:524
 [3] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [4] top-level scope
   @ none:6
 [5] eval
   @ ./boot.jl:368 [inlined]
 [6] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:276
 [7] _start()
   @ Base ./client.jl:522
cublas: Test Failed at /home/sunjin/.julia/packages/CUDA/ifpPv/test/cublas.jl:1031
  Expression: C ≈ Array(dA)
   Evaluated: ...
Stacktrace:
 [1] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail})
   @ Test /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:983
 [2] top-level scope
   @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:524
 [3] include(fname::String)
   @ Base.MainInclude ./client.jl:476
 [4] top-level scope
   @ none:6
 [5] eval
   @ ./boot.jl:368 [inlined]
 [6] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:276
 [7] _start()
   @ Base ./client.jl:522
Worker 10 failed running test execution:
Some tests did not pass: 65 passed, 0 failed, 1 errored, 0 broken.
execution: Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/execution.jl:55
  Got exception outside of a @test
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(dummy), Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(dummy), Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:61 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:56 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:7
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
cusolver/sparse: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(7)
cusparse/linalg: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(12)

Second half of the output:

Test Summary:                                  |  Pass  Fail  Error  Broken  Total  Time
  Overall                                      | 15192     2     11       5  15210      
    initialization                             |    30                          30      
    gpuarrays/indexing scalar                  |   476                         476      
    gpuarrays/math/power                       |                              None      
    gpuarrays/linalg/mul!/vector-matrix        |   168                         168      
    gpuarrays/interface                        |     7                           7      
    gpuarrays/indexing multidimensional        |    46                          46      
    gpuarrays/linalg                           |   231                         231      
    gpuarrays/reductions/reducedim!            |   192                         192      
    gpuarrays/uniformscaling                   |    56                          56      
    gpuarrays/reductions/any all count         |   101                         101      
    gpuarrays/math/intrinsics                  |    12                          12      
    gpuarrays/statistics                       |                  1              1      
    gpuarrays/linalg/mul!/matrix-matrix        |   432                         432      
    gpuarrays/constructors                     |   899                         899      
    gpuarrays/random                           |    62                          62      
    gpuarrays/linalg/norm                      |                  1              1      
    gpuarrays/base                             |    75                          75      
    gpuarrays/reductions/minimum maximum extrema |   666                         666      
    gpuarrays/reductions/mapreduce             |   396                         396      
    gpuarrays/reductions/== isequal            |   312                         312      
    gpuarrays/reductions/reduce                |   264                         264      
    apiutils                                   |     6                           6      
    gpuarrays/reductions/mapreducedim!         |                  1              1      
    broadcast                                  |    22                          22      
    codegen                                    |    13            3             16      
    gpuarrays/broadcasting                     |                  1              1      
    array                                      |   364                         364      
    cudadrv                                    |   139                    1    140      
    curand                                     |     1                           1      
    cufft                                      |   177                         177      
    examples                                   |                  1              1      
    cusparse                                   |   949                         949      
    cublas                                     |  2180     2                  2182      
    execution                                  |    65            1             66      
    gpuarrays/reductions/sum prod              |   862                         862      
    iterator                                   |    37                          37      
    nvtx                                       |                              None      
    pointer                                    |    35                          35      
    exceptions                                 |    17                          17      
    pool                                       |    10                          10      
    nvml                                       |    11                          11      
    linalg                                     |    21                          21      
    random                                     |   117                         117      
    utils                                      |    55                          55      
    threading                                  |                              None      
    cusolver/multigpu                          |    30                          30      
    texture                                    |    38                    4     42      
    cusolver/sparse                            |                  1              1      
    cusparse/conversions                       |    62                          62      
    cusparse/device                            |    10                          10      
    cusparse/broadcast                         |    65                          65      
    sorting                                    |   272                         272      
    cusparse/generic                           |  1238                        1238      
    device/array                               |    20                          20      
    cusparse/interfaces                        |  1042                        1042      
    device/intrinsics                          |    38                          38      
    device/ldg                                 |    22                          22      
    cusparse/linalg                            |                  1              1      
    cusolver/dense                             |  1940                        1940      
    device/intrinsics/memory                   |    16                          16      
┌ Info: System information:
│ CUDA runtime 11.8, artifact installation
│ CUDA driver 12.0
│ NVIDIA driver 527.56.0
│ 
│ Libraries: 
│ - CUBLAS: 11.11.3
│ - CURAND: 10.3.0
│ - CUFFT: 10.9.0
│ - CUSOLVER: 11.4.1
│ - CUSPARSE: 11.7.5
│ - CUPTI: 18.0.0
│ - NVML: 12.0.0+525.65
│ 
│ Toolchain:
│ - Julia: 1.8.1
│ - LLVM: 13.0.1
│ - PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
│ - Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
│ 
│ 1 device:
└   0: NVIDIA GeForce RTX 3060 Laptop GPU (sm_86, 5.679 GiB / 6.000 GiB available)
[ Info: Testing using 1 device(s): 0. NVIDIA GeForce RTX 3060 Laptop GPU (UUID f780bc3c-a9f0-9c3d-e723-c01de3450fd7)
Worker 5 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 3 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 2 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 4 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 6 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 7 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
Worker 12 terminated.
UNHANDLED TASK ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#680")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:941
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:950
 [3] unsafe_read
   @ ./io.jl:759 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:758
 [5] read!
   @ ./io.jl:760 [inlined]
 [6] deserialize_hdr_raw
   @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:484
ERROR: LoadError: Test run finished with errors
in expression starting at /home/sunjin/.julia/packages/CUDA/ifpPv/test/runtests.jl:555
    device/random                              |   156                         156      
    device/intrinsics/output                   |    40                          40      
    device/intrinsics/atomics                  |   147                         147      
    device/intrinsics/math                     |   104                         104      
    device/intrinsics/wmma                     |   446                         446      
    FAILURE

Error in testset gpuarrays/statistics:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(5)
Error in testset gpuarrays/linalg/norm:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(3)
Error in testset gpuarrays/reductions/mapreducedim!:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(2)
Error in testset codegen:
Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:170
  Test threw exception
  Expression: CUDA.code_sass(devnull, valid_kernel, Tuple{}) == nothing
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#valid_kernel#71", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#valid_kernel#71", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:464 [inlined]
    [7] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:170 [inlined]
    [8] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [9] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:167 [inlined]
   [10] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [11] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
Error in testset codegen:
Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:174
  Got exception outside of a @test
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(kernel_341), Tuple{Ptr{Int64}}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(kernel_341), Tuple{Ptr{Int64}}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:179 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:175 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
Error in testset codegen:
Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/codegen.jl:182
  Got exception outside of a @test
  CUPTIError: CUPTI doesn't allow multiple callback subscribers. Only a single subscriber can be registered at a time. (code 39, CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#kernel#73", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{var"#kernel#73", Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:185 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:183 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/codegen.jl:166
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
Error in testset gpuarrays/broadcasting:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(4)
Error in testset examples:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(6)
Error in testset cublas:
Test Failed at /home/sunjin/.julia/packages/CUDA/ifpPv/test/cublas.jl:1031
  Expression: C ≈ Array(dA)
   Evaluated: ComplexF32[...]
Error in testset execution:
Error During Test at /home/sunjin/.julia/packages/CUDA/ifpPv/test/execution.jl:55
  Got exception outside of a @test
  CUPTIError: CUPTI is unable to initialize its connection to the CUDA driver (code 15, CUPTI_ERROR_NOT_INITIALIZED)
  Stacktrace:
    [1] (::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(dummy), Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}})()
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:76
    [2] lock(f::CUDA.var"#229#231"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(dummy), Tuple{}}}, Ptr{Nothing}, Base.RefValue{Any}, NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}}}, l::ReentrantLock)
      @ Base ./lock.jl:185
    [3] #code_sass#228
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:69 [inlined]
    [4] code_sass(io::Base.DevNull, func::Any, types::Any, kernel::Bool; verbose::Bool, always_inline::Bool, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ CUDA ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:51
    [5] code_sass (repeats 2 times)
      @ ~/.julia/packages/CUDA/ifpPv/src/compiler/reflection.jl:45 [inlined]
    [6] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:61 [inlined]
    [7] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
    [8] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:56 [inlined]
    [9] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [10] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/execution.jl:7
   [11] include
      @ ./client.jl:476 [inlined]
   [12] #12
      @ ~/.julia/packages/CUDA/ifpPv/test/runtests.jl:101 [inlined]
   [13] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [14] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Test/src/Test.jl:1357 [inlined]
   [15] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:55 [inlined]
   [16] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/utilities.jl:25 [inlined]
   [17] macro expansion
      @ ~/.julia/packages/CUDA/ifpPv/src/pool.jl:573 [inlined]
   [18] top-level scope
      @ ~/.julia/packages/CUDA/ifpPv/test/setup.jl:54
   [19] eval
      @ ./boot.jl:368 [inlined]
   [20] runtests(f::Function, name::String, time_source::Symbol, snoop::Nothing)
      @ Main ~/.julia/packages/CUDA/ifpPv/test/setup.jl:66
   [21] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
      @ Base ./essentials.jl:729
   [22] invokelatest(::Any, ::Any, ::Vararg{Any})
      @ Base ./essentials.jl:726
   [23] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285
   [24] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
      @ Distributed /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
   [25] macro expansion
      @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:285 [inlined]
   [26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
      @ Distributed ./task.jl:484
Error in testset cusolver/sparse:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(7)
Error in testset cusparse/linalg:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(12)
ERROR: LoadError: Package CUDA errored during testing
Stacktrace:
  [1] pkgerror(msg::String)
    @ Pkg.Types /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
  [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
    @ Pkg.Operations /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/Operations.jl:1813
  [3] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Cmd, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::Base.Pairs{Symbol, IOStream, Tuple{Symbol}, NamedTuple{(:io,), Tuple{IOStream}}})
    @ Pkg.API /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:431
  [4] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOStream, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Pkg.API /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:156
  [5] test(pkgs::Vector{Pkg.Types.PackageSpec})
    @ Pkg.API /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:145
  [6] #test#87
    @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
  [7] test
    @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
  [8] #test#86
    @ /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:143 [inlined]
  [9] test(pkg::String)
    @ Pkg.API /opt/julia-1.8.1/share/julia/stdlib/v1.8/Pkg/src/API.jl:143
 [10] top-level scope
    @ ~/julia/test.jl:4
in expression starting at /home/sunjin/julia/test.jl:4

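Something seems to be wrong with your driver setup: CUPTI cannot initialize its connection to the CUDA driver (`CUPTI_ERROR_NOT_INITIALIZED` in the `execution` testset). CUDA.jl uses this library to reflect on native code, as in the `code_sass` call in the stack trace, so its unavailability will break a number of tests.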
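
To check this outside the full test suite, you could try the same reflection call that appears in the stack trace. This is only a minimal sketch: `dummy` and `check_sass_reflection` are names made up for illustration (the former mirrors `typeof(dummy)` in the trace), and it assumes that calling `CUDA.code_sass` directly reproduces the failure on your machine.

```julia
using CUDA

# Hypothetical no-op kernel, named after `typeof(dummy)` in the stack trace.
dummy() = return

# Try to disassemble the kernel to SASS, which goes through CUPTI.
# Returns :ok if reflection works, :failed if any error is thrown.
function check_sass_reflection()
    try
        # Same argument shape as in the trace: (io, func, types).
        CUDA.code_sass(devnull, dummy, Tuple{})
        return :ok
    catch err
        # Print the error (for example the CUPTIError seen in the log).
        showerror(stderr, err)
        println(stderr)
        return :failed
    end
end

println("SASS reflection: ", check_sass_reflection())
```

If this prints `:failed` with the same `CUPTIError`, the problem is in the driver/CUPTI setup rather than in the test suite itself. `CUDA.versioninfo()` prints the same system information block shown at the top of the test log, which is useful to include when reporting the result.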