Errors reported during Pkg.test("CUDA")

Hello,

While I’ve been using Julia for a number of years now, this is my first attempt to use CUDA with Julia. I’ve encountered some errors while running the tests.

OS: Windows 11 | Version 10.0.26120 Build 26120

julia> using CUDA

(@v1.11) pkg> test CUDA
     Testing CUDA

Testing Running tests...
┌ Info: System information:
│ CUDA runtime 12.8, artifact installation
│ CUDA driver 12.8
│ NVIDIA driver 572.61.0
│
│ CUDA libraries:
│ - CUBLAS: 12.8.4
│ - CURAND: 10.3.9
│ - CUFFT: 11.3.3
│ - CUSOLVER: 11.7.3
│ - CUSPARSE: 12.5.8
│ - CUPTI: 2025.1.1 (API 26.0.0)
│ - NVML: 12.0.0+572.61
│
│ Julia packages:
│ - CUDA: 5.7.3
│ - CUDA_Driver_jll: 0.12.1+1
│ - CUDA_Runtime_jll: 0.16.1+0
│
│ Toolchain:
│ - Julia: 1.11.5
│ - LLVM: 16.0.6
│
│ 1 device:
└   0: NVIDIA GeForce GTX 1080 (sm_61, 7.283 GiB / 8.000 GiB available)
[ Info: Testing using device 0 (NVIDIA GeForce GTX 1080). 

.  .  .

 From worker 4:    WARNING: Method definition var"#5780#kernel"(Any) in module Main at C:\Users\. . .\.julia\packages\CUDA\oymHm\test\core\execution.jl:360 overwritten at C:\Users\. . . \.julia\packages\CUDA\oymHm\test\core\execution.jl:368.

.  .  .

base/examples                                 (6) |         failed at 2025-04-22T22:27:51.941
Worker 6 terminated.
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#832")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base .\stream.jl:979
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base .\stream.jl:987
 [3] unsafe_read
   @ .\io.jl:890 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base .\io.jl:889
 [5] read!
   @ .\io.jl:894 [inlined]
 [6] deserialize_hdr_raw
   @ C:\Users\Audrius Stundzia\AppData\Local\Programs\Julia-1.11.5\share\julia\stdlib\v1.11\Distributed\src\messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed C:\Users\Audrius Stundzia\AppData\Local\Programs\Julia-1.11.5\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:172
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed C:\Users\Audrius Stundzia\AppData\Local\Programs\Julia-1.11.5\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:133
 [9] (::Distributed.var"#103#104"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed C:\Users\Audrius Stundzia\AppData\Local\Programs\Julia-1.11.5\share\julia\stdlib\v1.11\Distributed\src\process_messages.jl:121

.  .  .

Test Summary:                                    |  Pass  Error  Broken  Total  Time
  Overall                                        | 28796      1      12  28809
 core/cudadrv                                    |  2074              3   2077
 base/texture                                    |    56              4     60
 core/nvml                                       |    27              1     28
 base/examples                                   |            1              1
 base/kernelabstractions                         |  2462              4   2466

The CUDA.jl manual says, "On Windows, also make sure you have the Visual C++ redistributable installed." I downloaded the Visual C++ redistributable [2015 - 2019] and tried to install it.

It turns out that a more recent version [2015 - 2022] is already installed on my system, which prevents the linked redistributable from being installed. Is having the more recent redistributable installed okay?

Advice on how to address these errors would be appreciated.

Yes, that should be fine. If you didn't have it installed, you would run into "The specified module could not be found" errors during testing.

The base/examples test failures are probably unrelated. Try running each example in isolation (from the examples/ folder in the repository) to see which one fails.
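Running each example in isolation could be scripted roughly as follows. This is a minimal sketch, not part of CUDA.jl: `find_scripts` and `run_isolated` are hypothetical helper names, and it assumes the child processes see the same default environment (with CUDA installed) as the parent session. Each script runs in its own Julia process, so a script that crashes or exits only ends that child process and its exit code can still be recorded.

```julia
# Sketch: run each example script in a fresh Julia process and record its exit code.
# `find_scripts` and `run_isolated` are hypothetical helpers, not CUDA.jl functions.

"Collect every .jl file under `root`, including subdirectories."
function find_scripts(root::AbstractString)
    scripts = String[]
    for (dir, _, files) in walkdir(root)
        for f in files
            endswith(f, ".jl") && push!(scripts, joinpath(dir, f))
        end
    end
    return sort(scripts)
end

"Run `script` in a separate Julia process and return that process's exit code."
function run_isolated(script::AbstractString; flags=["--check-bounds=yes"])
    script = abspath(script)
    cmd = `$(Base.julia_cmd()) $flags $script`
    # ignorestatus keeps `run` from throwing when the child exits with a non-zero code.
    proc = run(ignorestatus(Cmd(cmd; dir=dirname(script))))
    return proc.exitcode
end

# Possible usage, assuming CUDA is loaded in the current session:
# using CUDA
# examples_dir = joinpath(pkgdir(CUDA), "examples")
# for s in find_scripts(examples_dir)
#     println(rpad(relpath(s, examples_dir), 40), " exit code: ", run_isolated(s))
# end
```

Some examples may need extra packages or a specific working directory, so a failure reported by this loop would still need to be checked by hand.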

Thanks for your reply and advice.

In the directory
C:\Users\. . .\.julia\packages\CUDA\oymHm\examples

I found the following

julia> include("hello_world.jl")
Greetings from block 2, thread 1!
Greetings from block 2, thread 2!
Greetings from block 1, thread 1!
Greetings from block 1, thread 2!
julia> include("pairwise.jl")
Test Passed
julia> include("peakflops.jl")
3.2044053e12
julia> include("vadd.jl")
Test Passed

Since all of these examples ran successfully, is this the correct examples/ directory to run them from?

Yes, but the subdirectories you see there contain other examples. Also, the test suite executes with --check-bounds=yes, in case that would cause something to fail.

Had a look and ran:

C:\Users\ . . .\ .julia\packages\CUDA\oymHm\examples\driver

julia> using CUDA

julia> include("vadd.jl")
Test Passed

C:\Users\ . . .\ .julia\packages\CUDA\oymHm\examples\wmma

julia> using CUDA

julia> include("high-level.jl")
PS C:\Users\ . . . \.julia\packages\CUDA\oymHm\examples\wmma>
julia> using CUDA

julia> include("low-level.jl")
PS C:\Users\ . . . \.julia\packages\CUDA\oymHm\examples\wmma>

Both scripts appear to crash: Julia exits back to the PowerShell prompt without printing any error message.

"Also, the test suite executes with --check-bounds=yes, in case that would cause something to fail."

PS C:\Users\. . . > julia -t auto --check-bounds=no

Precompiling InteractiveUtils...
  3 dependencies successfully precompiled in 5 seconds
Precompiling REPL...
  3 dependencies successfully precompiled in 50 seconds. 3 already precompiled.
julia> using CUDA
Precompiling CUDA...
  96 dependencies successfully precompiled in 143 seconds. 4 already precompiled.
Precompiling REPLExt...
  1 dependency successfully precompiled in 8 seconds. 26 already precompiled.
Precompiling StyledStringsExt...
  1 dependency successfully precompiled in 1 seconds. 4 already precompiled.
(@v1.11) pkg> test CUDA
     Testing CUDA

This results in the same base/examples error as in my original post.
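One thing worth keeping in mind here: as far as I know, Pkg.test runs the tests in a new process with bounds checking forced on by default, so the flag given to the interactive session would not carry over to the test run. The sketch below only shows how to inspect the mode of whichever process it runs in. `check_bounds_mode` is a hypothetical helper name, and `Base.JLOptions()` is an internal API whose layout could change between Julia versions.

```julia
# Sketch: report the bounds-checking mode of the running Julia process.
# `check_bounds_mode` is a hypothetical helper; Base.JLOptions() is internal to Julia.
function check_bounds_mode()
    code = Base.JLOptions().check_bounds   # 0 = default, 1 = yes, 2 = no
    return ("default", "yes", "no")[code + 1]
end

println("check-bounds mode: ", check_bounds_mode())
```

Calling this in a session started with `julia --check-bounds=no` and again from inside a script launched with `--check-bounds=yes` would show whether the two runs really differ in that setting.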

The silent exit makes this hard to debug, of course. But it does seem to be the source of the test suite failure.

In any case, if you don't use WMMA, you shouldn't run into problems. Also, IIUC the wmma tests themselves seem to pass fine on your system, so maybe this is just a problem with the examples.
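A possible angle to check, which is not confirmed anywhere in this thread: WMMA targets tensor cores, which to my knowledge require compute capability 7.0 (Volta) or newer, while the system report above lists the GTX 1080 as sm_61. The sketch below compares a capability against that threshold. `supports_wmma` is a hypothetical helper name, and the 7.0 threshold is stated as an assumption rather than taken from the posts.

```julia
# Sketch: decide whether a compute capability is new enough for WMMA.
# `supports_wmma` is a hypothetical helper, not a CUDA.jl function.
# Assumption: WMMA needs tensor cores, i.e. compute capability 7.0 (Volta) or newer.
supports_wmma(cap::VersionNumber) = cap >= v"7.0"

# The GTX 1080 in the system report is sm_61, i.e. capability 6.1.
println(supports_wmma(v"6.1"))   # false
println(supports_wmma(v"7.5"))   # true

# With CUDA.jl loaded, the live value could be queried like this:
# using CUDA
# supports_wmma(CUDA.capability(CUDA.device()))
```

If the check returns false for the installed device, the WMMA examples would not be expected to do useful work on it, whatever the cause of the silent exit turns out to be.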

Thank you for your replies and advice. I will now give CUDA a go and see whether any issues arise in actual use.