CUDA tests failing in WSL

I’m getting errors when running the CUDA.jl tests under WSL with Julia 1.11.1. Is WSL a supported platform?

I’ve set export LD_LIBRARY_PATH="/usr/lib/wsl/lib/:$LD_LIBRARY_PATH", as suggested in post #6 (by zwwi) of the thread “How to make CUDA.jl work in WSL2”.

I’ve also successfully built all the samples from the NVIDIA/cuda-samples repository on GitHub, so it doesn’t seem to be a system issue.

Here’s what I get when I try running the tests:

(@v1.11) pkg> test CUDA
     Testing CUDA
      Status `/tmp/jl_2H5m5h/Project.toml`
  [621f4979] AbstractFFTs v1.5.0
  [79e6a3ab] Adapt v4.0.4
  [ab4f0b2a] BFloat16s v0.5.0
  [052768ef] CUDA v5.5.2
  [d360d2e6] ChainRulesCore v1.25.0
  [864edb3b] DataStructures v0.18.20
  [7a1cc6ca] FFTW v1.8.0
⌅ [0c68f7d7] GPUArrays v10.3.1
⌅ [61eb1bfa] GPUCompiler v0.27.8
  [a98d9a8b] Interpolations v0.15.1
  [63c18a36] KernelAbstractions v0.9.28
  [5da4648a] NVTX v0.3.4
  [276daf66] SpecialFunctions v2.4.0
  [90137ffa] StaticArrays v1.9.7
  [10745b16] Statistics v1.11.1
  [4ee394cb] CUDA_Driver_jll v0.10.3+0
  [76a88914] CUDA_Runtime_jll v0.15.3+0
  [ade2ca70] Dates v1.11.0
  [8ba89e20] Distributed v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [37e2e46d] LinearAlgebra v1.11.0
  [44cfe95a] Pkg v1.11.0
  [de0858da] Printf v1.11.0
  [3fa0cd96] REPL v1.11.0
  [9a3f8284] Random v1.11.0
  [2f01184e] SparseArrays v1.11.0
  [8dfed614] Test v1.11.0
      Status `/tmp/jl_2H5m5h/Manifest.toml`
  [621f4979] AbstractFFTs v1.5.0
  [79e6a3ab] Adapt v4.0.4
  [a9b6321e] Atomix v0.1.0
  [13072b0f] AxisAlgorithms v1.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [fa961155] CEnum v0.5.0
  [052768ef] CUDA v5.5.2
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [d360d2e6] ChainRulesCore v1.25.0
⌅ [3da002f7] ColorTypes v0.11.5
  [5ae59095] Colors v0.12.11
  [34da2185] Compat v4.16.0
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.7.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [ffbed154] DocStringExtensions v0.9.3
  [e2ba6199] ExprTools v0.1.10
  [7a1cc6ca] FFTW v1.8.0
  [53c48c17] FixedPointNumbers v0.8.5
⌅ [0c68f7d7] GPUArrays v10.3.1
⌅ [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.27.8
  [842dd82b] InlineStrings v1.4.2
  [a98d9a8b] Interpolations v0.15.1
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [692b3bcd] JLLWrappers v1.6.1
  [63c18a36] KernelAbstractions v0.9.28
  [929cbde3] LLVM v9.1.2
  [8b046642] LLVMLoopInfo v1.0.0
  [b964fa9f] LaTeXStrings v1.4.0
  [2ab3a3ac] LogExpFunctions v0.3.28
  [1914dd2f] MacroTools v0.5.13
  [e1d29d7a] Missings v1.2.0
  [5da4648a] NVTX v0.3.4
  [6fe1bfb0] OffsetArrays v1.14.1
  [bac558e1] OrderedCollections v1.6.3
  [2dfb63ee] PooledArrays v1.4.3
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.4.0
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [c84ed2f1] Ratios v0.4.5
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.5
  [a2af1166] SortingAlgorithms v1.2.1
  [276daf66] SpecialFunctions v2.4.0
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [10745b16] Statistics v1.11.1
  [892a3eda] StringManipulation v0.4.0
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [a759f4b9] TimerOutputs v0.5.25
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [efce3f68] WoodburyMatrices v1.0.0
  [4ee394cb] CUDA_Driver_jll v0.10.3+0
  [76a88914] CUDA_Runtime_jll v0.15.3+0
  [f5851436] FFTW_jll v3.3.10+1
  [1d5cc7b8] IntelOpenMP_jll v2024.2.1+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [dad2f222] LLVMExtra_jll v0.0.34+0
  [856f044c] MKL_jll v2024.2.0+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [1e29f10c] demumble_jll v1.3.0+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
  [0dad84c5] ArgTools v1.1.2
  [56f22d72] Artifacts v1.11.0
  [2a0f44e3] Base64 v1.11.0
  [ade2ca70] Dates v1.11.0
  [8ba89e20] Distributed v1.11.0
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching v1.11.0
  [9fa8497b] Future v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [4af54fe1] LazyArtifacts v1.11.0
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2 v1.11.0
  [8f399da3] Libdl v1.11.0
  [37e2e46d] LinearAlgebra v1.11.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [a63ad114] Mmap v1.11.0
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.11.0
  [de0858da] Printf v1.11.0
  [3fa0cd96] REPL v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [1a1011a3] SharedArrays v1.11.0
  [6462fe0b] Sockets v1.11.0
  [2f01184e] SparseArrays v1.11.0
  [f489334b] StyledStrings v1.11.0
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test v1.11.0
  [cf7118a7] UUIDs v1.11.0
  [4ec0a83e] Unicode v1.11.0
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.6.0+0
  [e37daf67] LibGit2_jll v1.7.2+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.6+0
  [14a3606d] MozillaCACerts_jll v2023.12.12
  [4536629a] OpenBLAS_jll v0.3.27+1
  [05823500] OpenLibm_jll v0.8.1+2
  [bea87d4a] SuiteSparse_jll v7.7.0+0
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.11.0+0
  [8e850ede] nghttp2_jll v1.59.0+0
  [3f19e933] p7zip_jll v17.4.0+2
        Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading.
     Testing Running tests...
┌ Info: System information:
│ CUDA runtime 12.6, local installation
│ CUDA driver 12.7
│ NVIDIA driver 565.90.0
│
│ CUDA libraries:
│ - CUBLAS: 12.6.3
│ - CURAND: 10.3.7
│ - CUFFT: 11.3.0
│ - CUSOLVER: 11.7.1
│ - CUSPARSE: 12.5.4
│ - CUPTI: 2024.3.2 (API 24.0.0)
│ - NVML: 12.0.0+565.51.1
│
│ Julia packages:
│ - CUDA: 5.5.2
│ - CUDA_Driver_jll: 0.10.3+0
│ - CUDA_Runtime_jll: 0.15.3+0
│ - CUDA_Runtime_Discovery: 0.3.5
│
│ Toolchain:
│ - Julia: 1.11.1
│ - LLVM: 16.0.6
│
│ Preferences:
│ - CUDA_Runtime_jll.local: true
│
│ 1 device:
└   0: NVIDIA GeForce RTX 2080 Ti (sm_75, 8.542 GiB / 11.000 GiB available)
[ Info: Testing using device 0 (NVIDIA GeForce RTX 2080 Ti). To change this, specify the `--gpu` argument to the tests, or set the `CUDA_VISIBLE_DEVICES` environment variable.
[ Info: Running 4 tests in parallel. If this is too many, specify the `--jobs` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.
                                                  |          | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test                                     (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
core/initialization                           (2) |         failed at 2024-10-21T12:06:46.631
gpuarrays/reductions/reduce                   (4) |   118.96 |   0.01 |  0.0 |       1.21 |      N/A |   2.09 |  1.8 |   11273.36 |  1867.41 |
gpuarrays/reductions/mapreducedim!            (5) |   128.58 |   0.01 |  0.0 |       1.54 |      N/A |   1.81 |  1.4 |    8868.55 |  1941.51 |
gpuarrays/reductions/sum prod                 (3) |   151.83 |   0.02 |  0.0 |       3.24 |      N/A |   2.51 |  1.7 |   12015.22 |  2968.30 |
gpuarrays/reductions/== isequal               (4) |    40.74 |   0.01 |  0.0 |       1.07 |      N/A |   0.59 |  1.5 |    4505.22 |  2303.67 |
gpuarrays/vectors                             (4) |     0.31 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      25.45 |  2303.96 |
gpuarrays/base                                (5) |    33.61 |   0.00 |  0.0 |       8.90 |      N/A |   0.81 |  2.4 |    4186.80 |  2335.31 |
gpuarrays/random                              (3) |    17.38 |   0.00 |  0.0 |       0.03 |      N/A |   0.13 |  0.8 |    1172.08 |  3273.52 |
gpuarrays/constructors                        (4) |    27.21 |   0.01 |  0.0 |       0.65 |      N/A |   0.16 |  0.6 |    1437.16 |  2503.64 |
gpuarrays/statistics                          (3) |    54.56 |   0.00 |  0.0 |       1.51 |      N/A |   0.70 |  1.3 |    4439.07 |  4071.43 |
gpuarrays/math/intrinsics                     (3) |     3.58 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     212.71 |  4071.46 |
gpuarrays/broadcasting                        (6) |   211.51 |   0.02 |  0.0 |       2.00 |      N/A |   2.85 |  1.3 |   14374.53 |  2773.32 |
gpuarrays/reductions/mapreduce                (5) |    80.73 |   0.01 |  0.0 |       1.81 |      N/A |   1.66 |  2.1 |    8869.94 |  2906.58 |
gpuarrays/uniformscaling                      (5) |     7.65 |   0.00 |  0.0 |       0.01 |      N/A |   0.05 |  0.6 |     446.61 |  3167.45 |
gpuarrays/reductions/mapreducedim!_large      (6) |    28.15 |   0.00 |  0.0 |     818.34 |      N/A |   0.77 |  2.8 |    3797.14 |  3201.48 |
gpuarrays/reductions/any all count            (6) |    11.32 |   0.00 |  0.0 |       0.00 |      N/A |   0.11 |  1.0 |    1044.87 |  3201.48 |
gpuarrays/interface                           (6) |     1.73 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     136.67 |  3201.48 |
gpuarrays/linalg/mul!/matrix-matrix           (3) |    69.24 |   0.02 |  0.0 |       0.12 |      N/A |   0.99 |  1.4 |    6692.11 |  4838.01 |
gpuarrays/indexing multidimensional           (6) |    39.67 |   0.00 |  0.0 |       2.07 |      N/A |   0.49 |  1.2 |    3840.19 |  3687.58 |
gpuarrays/indexing find                       (3) |    17.10 |   0.00 |  0.0 |       0.13 |      N/A |   0.21 |  1.2 |    1588.11 |  5136.23 |
gpuarrays/linalg/norm                         (4) |   141.67 |   0.01 |  0.0 |       0.02 |      N/A |   2.22 |  1.6 |   11781.15 |  5436.08 |
gpuarrays/math/power                          (3) |    19.59 |   0.00 |  0.0 |       0.01 |      N/A |   0.22 |  1.1 |    1813.39 |  5290.36 |
gpuarrays/linalg/mul!/vector-matrix           (6) |    40.18 |   0.01 |  0.0 |       0.02 |      N/A |   0.59 |  1.5 |    4274.25 |  4240.55 |
gpuarrays/indexing scalar                     (6) |     9.61 |   0.00 |  0.0 |       0.01 |      N/A |   0.07 |  0.7 |     740.45 |  4558.29 |
gpuarrays/reductions/reducedim!               (3) |    51.98 |   0.00 |  0.0 |       1.03 |      N/A |   0.52 |  1.0 |    3297.56 |  5560.60 |
gpuarrays/reductions/minimum maximum extrema  (5) |   140.68 |   0.01 |  0.0 |       2.19 |      N/A |   2.50 |  1.8 |   12240.49 |  5110.31 |
gpuarrays/linalg                              (4) |    97.66 |   0.01 |  0.0 |      26.35 |      N/A |   2.73 |  2.8 |    9171.38 |  6019.63 |
      From worker 4:    WARNING: Method definition var"#3699#kernel"(Any) in module Main at /home/dfenn/.julia/packages/CUDA/2kjXI/test/core/execution.jl:360 overwritten at /home/dfenn/.julia/packages/CUDA/2kjXI/test/core/execution.jl:368.
core/execution                                (4) |    29.67 |   0.00 |  0.0 |       0.02 |      N/A |   0.35 |  1.2 |    2098.18 |  6144.21 |
libraries/cusparse                            (3) |   121.59 |   0.04 |  0.0 |      12.58 |      N/A |   1.15 |  0.9 |    6692.43 |  5845.63 |
libraries/cusolver/dense                      (5) |   156.59 |   0.09 |  0.1 |     262.80 |      N/A |   3.40 |  2.2 |   13940.58 |  5692.39 |
libraries/cublas                              (6) |   193.50 |   0.05 |  0.0 |      43.04 |      N/A |   3.82 |  2.0 |   18481.35 |  4657.56 |
      From worker 6:    WARNING: using CUSPARSE.axpby! in module Main conflicts with an existing identifier.
core/cudadrv                                  (5) |     8.61 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     321.93 |  5692.39 |
libraries/cusparse/interfaces                 (4) |         failed at 2024-10-21T12:16:15.553
Worker 4 terminated.
base/array                                    (3) |    61.91 |   0.03 |  0.0 |    1286.68 |      N/A |   0.86 |  1.4 |    5580.96 |  5845.63 |
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#832")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:970
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:978
 [3] unsafe_read
   @ ./io.jl:891 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:890
 [5] read!
   @ ./io.jl:895 [inlined]
 [6] deserialize_hdr_raw
   @ ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:121
core/device/intrinsics/atomics                (3) |    18.10 |   0.00 |  0.0 |       0.00 |      N/A |   0.08 |  0.4 |     959.16 |  5845.63 |
libraries/cusparse/generic                    (6) |    68.32 |   0.06 |  0.1 |       5.51 |      N/A |   0.75 |  1.1 |    3843.07 |  4657.56 |
libraries/cusparse/conversions                (6) |    10.34 |   0.01 |  0.1 |       1.69 |      N/A |   0.12 |  1.2 |     953.24 |  4850.63 |
base/sorting                                  (5) |    97.86 |   0.01 |  0.0 |     668.44 |      N/A |   5.00 |  5.1 |   13480.95 |  8613.95 |
core/device/intrinsics/wmma                   (7) |    74.21 |   0.02 |  0.0 |       0.63 |      N/A |   1.01 |  1.4 |    5513.40 |  2395.82 |
core/device/intrinsics                        (5) |    13.63 |   0.00 |  0.0 |       0.00 |      N/A |   0.11 |  0.8 |     787.93 |  8613.95 |
core/device/intrinsics/cooperative_groups     (6) |    38.22 |   0.01 |  0.0 |      19.75 |      N/A |   0.29 |  0.8 |    1903.93 |  6160.97 |
libraries/cufft                               (3) |    84.01 |   0.01 |  0.0 |     197.64 |      N/A |   0.94 |  1.1 |    4997.51 |  5845.63 |
libraries/cusolver/sparse                     (5) |    12.75 |   0.00 |  0.0 |       0.22 |      N/A |   0.00 |  0.0 |     563.70 |  8613.95 |
core/device/array                             (5) |     3.67 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     239.57 |  8613.95 |
base/texture                                  (7) |    38.72 |   0.00 |  0.0 |       0.09 |      N/A |   0.50 |  1.3 |    3462.82 |  2597.63 |
core/device/intrinsics/memory                 (5) |     7.38 |   0.00 |  0.0 |       0.02 |      N/A |   0.00 |  0.0 |     402.08 |  8613.95 |
libraries/cusparse/bmm                        (6) |    26.29 |   0.03 |  0.1 |       0.90 |      N/A |   0.58 |  2.2 |    3719.68 |  6384.03 |
base/random                                   (3) |    25.72 |   0.07 |  0.3 |     256.59 |      N/A |   0.17 |  0.6 |    1458.46 |  5845.63 |
core/codegen                                  (7) |    19.25 |   0.03 |  0.1 |       0.00 |      N/A |   0.00 |  0.0 |     173.62 |  2597.63 |
core/device/intrinsics/output                 (3) |         failed at 2024-10-21T12:26:14.068
Worker 3 terminated.
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#832")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:970
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:978
 [3] unsafe_read
   @ ./io.jl:891 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:890
 [5] read!
   @ ./io.jl:895 [inlined]
 [6] deserialize_hdr_raw
   @ ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:121
libraries/cusolver/dense_generic              (5) |   468.10 |   7.14 |  1.5 |       0.30 |      N/A |   0.16 |  0.0 |     872.16 |  8613.95 |
core/device/random                            (7) |   446.30 |   3.30 |  0.7 |       0.17 |      N/A |   0.14 |  0.0 |     929.84 |  3038.19 |
core/device/ldg                               (5) |     6.96 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     520.83 |  8613.95 |
core/pointer                                  (7) |     0.36 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      11.71 |  3138.70 |
core/nvml                                     (7) |     1.06 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      62.21 |  3140.75 |
core/device/intrinsics/math                   (6) |   477.15 |   4.18 |  0.9 |       0.00 |      N/A |   0.35 |  0.1 |    2111.20 |  7223.90 |
base/broadcast                                (5) |    14.01 |   0.00 |  0.0 |       0.00 |      N/A |   0.11 |  0.8 |    1037.20 |  8613.95 |
libraries/cusolver/sparse_factorizations      (5) |    14.53 |   0.00 |  0.0 |       4.89 |      N/A |   0.19 |  1.3 |    1594.00 |  8613.95 |
libraries/cusolver/multigpu                   (8) |    46.60 |   0.01 |  0.0 |     545.60 |      N/A |   0.86 |  1.9 |    3451.87 |  1446.05 |
libraries/cusparse/linalg                     (6) |         failed at 2024-10-21T12:27:19.443
Worker 6 terminated.
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#832")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:970
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:978
 [3] unsafe_read
   @ ./io.jl:891 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:890
 [5] read!
   @ ./io.jl:895 [inlined]
 [6] deserialize_hdr_raw
   @ ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Distributed/src/process_messages.jl:121
base/iterator                                 (8) |     6.16 |   0.01 |  0.1 |       1.93 |      N/A |   0.05 |  0.8 |     493.87 |  1446.05 |
core/utils                                    (8) |     1.33 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |      76.02 |  1446.05 |
libraries/cusparse/device                     (8) |     1.61 |   0.00 |  0.0 |       0.01 |      N/A |   0.00 |  0.0 |      99.34 |  1446.05 |
libraries/staticarrays                        (8) |     1.50 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |     219.22 |  1446.05 |
base/threading                                (9) |    20.92 |   0.01 |  0.0 |      10.94 |      N/A |   0.19 |  0.9 |    1292.12 |  1220.85 |
base/exceptions                               (7) |    83.41 |   0.28 |  0.3 |       0.00 |      N/A |   0.00 |  0.0 |      11.84 |  3143.55 |
core/pool                                     (9) |     3.49 |   0.00 |  0.0 |       0.00 |      N/A |   0.39 | 11.2 |     378.99 |  1260.82 |
core/apiutils                                 (9) |     0.15 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |       1.09 |  1260.82 |
libraries/cusparse/broadcast                  (8) |    31.24 |   0.00 |  0.0 |       0.05 |      N/A |   0.34 |  1.1 |    2625.45 |  1446.05 |
libraries/cusparse/reduce                     (8) |     0.01 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |       0.27 |  1446.05 |
core/profile                                  (5) |    61.33 |   0.00 |  0.0 |       0.00 |      N/A |   2.31 |  3.8 |    7032.58 |  8613.95 |
libraries/curand                              (5) |     0.13 |   0.00 |  0.0 |       0.00 |      N/A |   0.00 |  0.0 |       5.60 |  8613.95 |
base/examples                                 (9) |     8.60 |   7.43 | 86.4 |       0.00 |      N/A |   0.43 |  5.1 |    1463.25 |  1854.25 |
base/linalg                                   (7) |    42.05 |   0.01 |  0.0 |    1547.52 |      N/A |   2.73 |  6.5 |    7297.39 |  4829.13 |
base/kernelabstractions                       (8) |    44.22 |   0.00 |  0.0 |      71.01 |      N/A |   0.78 |  1.8 |    3875.30 |  1922.15 |
Testing finished in 22 minutes, 12 seconds, 420 milliseconds
Worker 2 failed running test core/initialization:
Some tests did not pass: 33 passed, 1 failed, 0 errored, 0 broken.
core/initialization: Test Failed at /home/dfenn/.julia/packages/CUDA/2kjXI/test/core/initialization.jl:58
  Expression: haskey(NVML.compute_processes(nvml_dev), pid)

Stacktrace:
 [1] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail}; print_result::Bool)
   @ Test ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Test/src/Test.jl:1107
 [2] record(ts::Test.DefaultTestSet, t::Union{Test.Error, Test.Fail})
   @ Test ~/.julia/juliaup/julia-1.11.1+0.x64.linux.gnu/share/julia/stdlib/v1.11/Test/src/Test.jl:1100
 [3] top-level scope
   @ ~/.julia/packages/CUDA/2kjXI/test/runtests.jl:470
 [4] include(fname::String)
   @ Main ./sysimg.jl:38
 [5] top-level scope
   @ none:6
 [6] eval
   @ ./boot.jl:430 [inlined]
 [7] exec_options(opts::Base.JLOptions)
   @ Base ./client.jl:296
 [8] _start()
   @ Base ./client.jl:531
libraries/cusparse/interfaces: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(4)
core/device/intrinsics/output: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(3)
libraries/cusparse/linalg: Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(6)

Test Summary:                                  |  Pass  Fail  Error  Broken  Total  Time
  Overall                                      | 23242     1      3      11  23257
    core/initialization                        |    33     1                    34
    gpuarrays/reductions/reduce                |   264                         264
    gpuarrays/reductions/mapreducedim!         |   312                         312
    gpuarrays/reductions/sum prod              |   862                         862
    gpuarrays/reductions/== isequal            |   312                         312
    gpuarrays/vectors                          |    10                          10
    gpuarrays/base                             |    96                          96
    gpuarrays/random                           |    64                          64
    gpuarrays/constructors                     |   966                         966
    gpuarrays/statistics                       |    84                          84
    gpuarrays/math/intrinsics                  |    12                          12
    gpuarrays/broadcasting                     |   364                         364
    gpuarrays/reductions/mapreduce             |   396                         396
    gpuarrays/uniformscaling                   |    56                          56
    gpuarrays/reductions/mapreducedim!_large   |    50                          50
    gpuarrays/reductions/any all count         |   101                         101
    gpuarrays/interface                        |     7                           7
    gpuarrays/linalg/mul!/matrix-matrix        |   432                         432
    gpuarrays/indexing multidimensional        |   101                         101
    gpuarrays/indexing find                    |    45                          45
    gpuarrays/linalg/norm                      |   696                         696
    gpuarrays/math/power                       |    72                          72
    gpuarrays/linalg/mul!/vector-matrix        |   168                         168
    gpuarrays/indexing scalar                  |   477                         477
    gpuarrays/reductions/reducedim!            |   192                         192
    gpuarrays/reductions/minimum maximum extrema |   666                         666
    gpuarrays/linalg                           |   443                         443
    core/execution                             |    81                          81
    libraries/cusparse                         |   871                         871
    libraries/cusolver/dense                   |  3948                        3948
    libraries/cublas                           |  3496                        3496
    core/cudadrv                               |   158                    3    161
    libraries/cusparse/interfaces              |                  1              1
    base/array                                 |   399                         399
    core/device/intrinsics/atomics             |   147                         147
    libraries/cusparse/generic                 |  1312                        1312
    libraries/cusparse/conversions             |   136                         136
    base/sorting                               |   276                         276
    core/device/intrinsics/wmma                |   446                         446
    core/device/intrinsics                     |    38                          38
    core/device/intrinsics/cooperative_groups  |   515                         515
    libraries/cufft                            |   368                         368
    libraries/cusolver/sparse                  |   112                         112
    core/device/array                          |    20                          20
    base/texture                               |    38                    4     42
    core/device/intrinsics/memory              |    16                          16
    libraries/cusparse/bmm                     |    40                          40
    base/random                                |   236                         236
    core/codegen                               |    15                          15
    core/device/intrinsics/output              |                  1              1
    libraries/cusolver/dense_generic           |   112                         112
    core/device/random                         |   156                         156
    core/device/ldg                            |    41                          41
    core/pointer                               |    35                          35
    core/nvml                                  |    28                          28
    core/device/intrinsics/math                |   112                         112
    base/broadcast                             |    32                          32
    libraries/cusolver/sparse_factorizations   |    36                          36
    libraries/cusolver/multigpu                |    30                          30
    libraries/cusparse/linalg                  |                  1              1
    base/iterator                              |    45                          45
    core/utils                                 |    52                          52
    libraries/cusparse/device                  |    10                          10
    libraries/staticarrays                     |     1                           1
    base/threading                             |                                 0
    base/exceptions                            |    21                          21
    core/pool                                  |    10                          10
    core/apiutils                              |     6                           6
    libraries/cusparse/broadcast               |    65                          65
    libraries/cusparse/reduce                  |                                 0
    core/profile                               |    25                          25
    libraries/curand                           |     1                           1
    base/examples                              |     5                           5
    base/linalg                                |    39                          39
    base/kernelabstractions                    |  2431                    4   2435
    FAILURE

Error in testset core/initialization:
Test Failed at /home/dfenn/.julia/packages/CUDA/2kjXI/test/core/initialization.jl:58
  Expression: haskey(NVML.compute_processes(nvml_dev), pid)

Error in testset libraries/cusparse/interfaces:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(4)
Error in testset core/device/intrinsics/output:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(3)
Error in testset libraries/cusparse/linalg:
Error During Test at none:1
  Got exception outside of a @test
  ProcessExitedException(6)
ERROR: LoadError: Test run finished with errors
in expression starting at /home/dfenn/.julia/packages/CUDA/2kjXI/test/runtests.jl:501
ERROR: Package CUDA errored during testing

I appreciate any help.

Looks like a couple of workers might have gotten killed – can you try with fewer of them (see the message at the start of the test run)?
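
For reference, here is a minimal sketch of one way to lower the worker count from the Julia REPL. It relies on the `--jobs` flag named in the test log and on the `test_args` keyword of `Pkg.test`; the `--jobs=N` spelling is an assumption, and `jobs_args` and `run_cuda_tests` are hypothetical helper names, not part of CUDA.jl.

```julia
using Pkg

# Hypothetical helper (not part of CUDA.jl): build the arguments that
# Pkg.test forwards to the package's test runner.
# Assumption: the runner accepts the flag in the form `--jobs=N`.
function jobs_args(n::Integer)
    n >= 1 || throw(ArgumentError("need at least one worker"))
    return ["--jobs=$n"]
end

# Hypothetical wrapper: run the CUDA.jl test suite with `n` parallel workers.
run_cuda_tests(n::Integer) = Pkg.test("CUDA"; test_args=jobs_args(n))

# Example (starts the full test suite, so it is left commented out here):
# run_cuda_tests(2)
```

Alternatively, the log message says the `JULIA_CPU_THREADS` environment variable can be set before starting Julia to the same effect.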

The NVML failure (the core/initialization test that checks NVML.compute_processes) is not a problem.

So apart from the workers getting killed, almost everything seems to be working just fine. It has been a while since I tested WSL myself, but there’s no reason to expect CUDA.jl wouldn’t work on it.

Running with fewer workers fixed it. Thank you for your help.