Hello everyone
I encountered a problem while testing CUDA.jl, and the same issue persists after trying several different versions of CUDA.jl. I don't know why this happens.
I connect to a Linux server (Ubuntu 24.04.2 LTS) through VSCode. Here is some basic information first:
```
julia> CUDA.versioninfo()
CUDA toolchain:
- runtime 12.9, artifact installation
- driver 570.172.8 for 12.8
- compiler 12.9

CUDA libraries:
- CUBLAS: 12.9.1
- CURAND: 10.3.10
- CUFFT: 11.4.1
- CUSOLVER: 11.7.5
- CUSPARSE: 12.5.10
- CUPTI: 2025.2.1 (API 28.0.0)
- NVML: 12.0.0+570.172.8

Julia packages:
- CUDA: 5.8.3
- CUDA_Driver_jll: 13.0.1+0
- CUDA_Compiler_jll: 0.2.1+0
- CUDA_Runtime_jll: 0.19.1+0

Toolchain:
- Julia: 1.10.10
- LLVM: 15.0.7

4 devices:
  0: NVIDIA GeForce RTX 4090 (sm_89, 792.562 MiB / 23.988 GiB available)
  1: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)
  2: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)
  3: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)

julia> versioninfo()
Julia Version 1.10.10
Commit 95f30e51f41 (2025-06-27 09:51 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × INTEL(R) XEON(R) GOLD 6530
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, icelake-client)
  Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
```
```
qtt@qt1:~$ nvidia-smi
Sun Oct  5 11:51:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.172.08             Driver Version: 570.172.08     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:27:00.0 Off |                  Off |
|  0%   29C    P8              4W /  450W |   23300MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off |   00000000:38:00.0 Off |                  Off |
|  0%   28C    P8             13W /  450W |       3MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090        Off |   00000000:A8:00.0 Off |                  Off |
|  0%   31C    P8             13W /  450W |       3MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090        Off |   00000000:B8:00.0 Off |                  Off |
|  0%   29C    P8             19W /  450W |       3MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A          770765      C   ...e/qtt/julia-1.10.10/bin/julia       1672MiB |
|    0   N/A  N/A          770767      C   ...e/qtt/julia-1.10.10/bin/julia        476MiB |
|    0   N/A  N/A          770768      C   ...e/qtt/julia-1.10.10/bin/julia        564MiB |
|    0   N/A  N/A          770770      C   ...e/qtt/julia-1.10.10/bin/julia       5616MiB |
|    0   N/A  N/A          770771      C   ...e/qtt/julia-1.10.10/bin/julia        546MiB |
|    0   N/A  N/A          770773      C   ...e/qtt/julia-1.10.10/bin/julia        480MiB |
|    0   N/A  N/A          770774      C   ...e/qtt/julia-1.10.10/bin/julia        670MiB |
|    0   N/A  N/A          771159      C   ...e/qtt/julia-1.10.10/bin/julia        684MiB |
|    0   N/A  N/A          774737      C   ...e/qtt/julia-1.10.10/bin/julia        542MiB |
|    0   N/A  N/A         1286059      C   ...e/qtt/julia-1.10.10/bin/julia       4356MiB |
|    0   N/A  N/A         1286060      C   ...e/qtt/julia-1.10.10/bin/julia       5646MiB |
|    0   N/A  N/A         1286253      C   ...e/qtt/julia-1.10.10/bin/julia        866MiB |
|    0   N/A  N/A         1290232      C   ...e/qtt/julia-1.10.10/bin/julia        544MiB |
|    0   N/A  N/A         1290741      C   ...e/qtt/julia-1.10.10/bin/julia        558MiB |
+-----------------------------------------------------------------------------------------+
```
Here are the results from testing CUDA:
(@v1.10) pkg> test CUDA
Testing CUDA
Status `/tmp/jl_TPkd71/Project.toml`
[621f4979] AbstractFFTs v1.5.0
[79e6a3ab] Adapt v4.4.0
[ab4f0b2a] BFloat16s v0.5.1
⌃ [052768ef] CUDA v5.8.3
[d360d2e6] ChainRulesCore v1.26.0
[864edb3b] DataStructures v0.19.1
[7a1cc6ca] FFTW v1.10.0
[0c68f7d7] GPUArrays v11.2.5
[61eb1bfa] GPUCompiler v1.6.2
[a98d9a8b] Interpolations v0.16.2
⌃ [033835bb] JLD2 v0.5.15
[63c18a36] KernelAbstractions v0.9.38
[5da4648a] NVTX v1.0.1
[a0a7dd2c] SparseMatricesCSR v0.6.9
[276daf66] SpecialFunctions v2.6.1
[90137ffa] StaticArrays v1.9.15
[4ee394cb] CUDA_Driver_jll v13.0.1+0
[76a88914] CUDA_Runtime_jll v0.19.1+0
[ade2ca70] Dates
[8ba89e20] Distributed
[b77e0a4c] InteractiveUtils
[37e2e46d] LinearAlgebra
[44cfe95a] Pkg v1.10.0
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[2f01184e] SparseArrays v1.10.0
[10745b16] Statistics v1.10.0
[8dfed614] Test
Status `/tmp/jl_TPkd71/Manifest.toml`
[621f4979] AbstractFFTs v1.5.0
[79e6a3ab] Adapt v4.4.0
[a9b6321e] Atomix v1.1.2
[13072b0f] AxisAlgorithms v1.1.0
[ab4f0b2a] BFloat16s v0.5.1
[fa961155] CEnum v0.5.0
⌃ [052768ef] CUDA v5.8.3
[1af6417a] CUDA_Runtime_Discovery v1.0.0
[d360d2e6] ChainRulesCore v1.26.0
[3da002f7] ColorTypes v0.12.1
[5ae59095] Colors v0.13.1
[34da2185] Compat v4.18.1
[a8cc5b0e] Crayons v4.1.1
[9a962f9c] DataAPI v1.16.0
[a93c6f00] DataFrames v1.8.0
[864edb3b] DataStructures v0.19.1
[e2d170a0] DataValueInterfaces v1.0.0
[ffbed154] DocStringExtensions v0.9.5
[e2ba6199] ExprTools v0.1.10
[7a1cc6ca] FFTW v1.10.0
[5789e2e9] FileIO v1.17.0
[53c48c17] FixedPointNumbers v0.8.5
[0c68f7d7] GPUArrays v11.2.5
[46192b85] GPUArraysCore v0.2.0
[61eb1bfa] GPUCompiler v1.6.2
[096a3bc2] GPUToolbox v0.3.0
[076d061b] HashArrayMappedTries v0.2.0
[842dd82b] InlineStrings v1.4.5
[a98d9a8b] Interpolations v0.16.2
[41ab1584] InvertedIndices v1.3.1
[92d709cd] IrrationalConstants v0.2.4
[82899510] IteratorInterfaceExtensions v1.0.0
⌃ [033835bb] JLD2 v0.5.15
[692b3bcd] JLLWrappers v1.7.1
[63c18a36] KernelAbstractions v0.9.38
[929cbde3] LLVM v9.4.2
[8b046642] LLVMLoopInfo v1.0.0
[b964fa9f] LaTeXStrings v1.4.0
[2ab3a3ac] LogExpFunctions v0.3.29
[1914dd2f] MacroTools v0.5.16
[e1d29d7a] Missings v1.2.0
[5da4648a] NVTX v1.0.1
[6fe1bfb0] OffsetArrays v1.17.0
[bac558e1] OrderedCollections v1.8.1
[2dfb63ee] PooledArrays v1.4.3
⌅ [aea7be01] PrecompileTools v1.2.1
[21216c6a] Preferences v1.5.0
⌅ [08abe8d2] PrettyTables v2.4.0
[74087812] Random123 v1.7.1
[e6cf234a] RandomNumbers v1.6.0
[c84ed2f1] Ratios v0.4.5
[189a3867] Reexport v1.2.2
[ae029012] Requires v1.3.1
[7e506255] ScopedValues v1.5.0
[6c6a2e73] Scratch v1.3.0
[91c51154] SentinelArrays v1.4.8
[a2af1166] SortingAlgorithms v1.2.2
[a0a7dd2c] SparseMatricesCSR v0.6.9
[276daf66] SpecialFunctions v2.6.1
[90137ffa] StaticArrays v1.9.15
[1e83bf80] StaticArraysCore v1.4.3
[892a3eda] StringManipulation v0.4.1
[3783bdb8] TableTraits v1.0.1
[bd369af6] Tables v1.12.1
[e689c965] Tracy v0.1.6
[3bb67fe8] TranscodingStreams v0.11.3
[013be700] UnsafeAtomics v0.3.0
[efce3f68] WoodburyMatrices v1.0.0
[d1e2174e] CUDA_Compiler_jll v0.2.1+0
[4ee394cb] CUDA_Driver_jll v13.0.1+0
[76a88914] CUDA_Runtime_jll v0.19.1+0
[f5851436] FFTW_jll v3.3.11+0
[1d5cc7b8] IntelOpenMP_jll v2025.2.0+0
[9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
[dad2f222] LLVMExtra_jll v0.0.37+2
[ad6e5548] LibTracyClient_jll v0.9.1+6
[856f044c] MKL_jll v2025.2.0+0
[e98f9f5b] NVTX_jll v3.2.2+0
[efe28fd5] OpenSpecFun_jll v0.5.6+0
[1e29f10c] demumble_jll v1.3.0+0
[1317d2d5] oneTBB_jll v2022.0.0+0
[0dad84c5] ArgTools v1.1.1
[56f22d72] Artifacts
[2a0f44e3] Base64
[ade2ca70] Dates
[8ba89e20] Distributed
[f43a241f] Downloads v1.6.0
[7b1f6079] FileWatching
[9fa8497b] Future
[b77e0a4c] InteractiveUtils
[4af54fe1] LazyArtifacts
[b27032c2] LibCURL v0.6.4
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[a63ad114] Mmap
[ca575930] NetworkOptions v1.2.0
[44cfe95a] Pkg v1.10.0
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA v0.7.0
[9e88b42a] Serialization
[1a1011a3] SharedArrays
[6462fe0b] Sockets
[2f01184e] SparseArrays v1.10.0
[10745b16] Statistics v1.10.0
[4607b0f0] SuiteSparse
[fa267f1f] TOML v1.0.3
[a4e569a6] Tar v1.10.0
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
[e66e0078] CompilerSupportLibraries_jll v1.1.1+0
[deac9b47] LibCURL_jll v8.4.0+0
[e37daf67] LibGit2_jll v1.6.4+0
[29816b5a] LibSSH2_jll v1.11.0+1
[c8ffd9c3] MbedTLS_jll v2.28.2+1
[14a3606d] MozillaCACerts_jll v2023.1.10
[4536629a] OpenBLAS_jll v0.3.23+4
[05823500] OpenLibm_jll v0.8.5+0
[bea87d4a] SuiteSparse_jll v7.2.1+1
[83775a58] Zlib_jll v1.2.13+1
[8e850b90] libblastrampoline_jll v5.11.0+0
[8e850ede] nghttp2_jll v1.52.0+1
[3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading.
Precompiling packages finished.
105 dependencies successfully precompiled in 88 seconds. 7 already precompiled.
1 dependency had output during precompilation:
┌ MKL_jll
│ Downloading artifact: IntelOpenMP
└
Testing Running tests…
┌ Info: System information:
│ CUDA toolchain:
│ - runtime 12.9, artifact installation
│ - driver 570.172.8 for 12.8
│ - compiler 12.9
│
│ CUDA libraries:
│ - CUBLAS: 12.9.1
│ - CURAND: 10.3.10
│ - CUFFT: 11.4.1
│ - CUSOLVER: 11.7.5
│ - CUSPARSE: 12.5.10
│ - CUPTI: 2025.2.1 (API 28.0.0)
│ - NVML: 12.0.0+570.172.8
│
│ Julia packages:
│ - CUDA: 5.8.3
│ - CUDA_Driver_jll: 13.0.1+0
│ - CUDA_Compiler_jll: 0.2.1+0
│ - CUDA_Runtime_jll: 0.19.1+0
│
│ Toolchain:
│ - Julia: 1.10.10
│ - LLVM: 15.0.7
│
│ 4 devices:
│ 0: NVIDIA GeForce RTX 4090 (sm_89, 792.562 MiB / 23.988 GiB available)
│ 1: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)
│ 2: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)
└ 3: NVIDIA GeForce RTX 4090 (sm_89, 23.524 GiB / 23.988 GiB available)
[ Info: Testing using device 0 (NVIDIA GeForce RTX 4090). To change this, specify the `--gpu` argument to the tests, or set the `CUDA_VISIBLE_DEVICES` environment variable.
[ Info: Running 1 tests in parallel. If this is too many, specify the `--jobs` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.
| | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
core/initialization (2) | 5.57 | 0.00 | 0.0 | 0.00 | 416.00 | 0.01 | 0.1 | 64.24 | 1478.82 |
gpuarrays/reductions/sum prod (3) | 157.15 | 0.02 | 0.0 | 3.24 | 464.00 | 3.86 | 2.5 | 10638.09 | 3235.04 |
gpuarrays/reductions/reduce (3) | 90.22 | 0.01 | 0.0 | 1.53 | 468.00 | 2.00 | 2.2 | 8644.59 | 3901.03 |
gpuarrays/reductions/mapreducedim! (3) | 64.31 | 0.00 | 0.0 | 1.54 | 470.00 | 1.06 | 1.7 | 4089.67 | 4448.71 |
gpuarrays/broadcasting (3) | 152.41 | 0.01 | 0.0 | 2.00 | 476.00 | 2.49 | 1.6 | 9933.03 | 5858.54 |
gpuarrays/reductions/== isequal (3) | 51.18 | 0.01 | 0.0 | 1.07 | 478.00 | 1.33 | 2.6 | 5287.18 | 6211.80 |
gpuarrays/base (3) | 24.77 | 0.00 | 0.0 | 8.90 | 480.00 | 0.80 | 3.2 | 2581.06 | 6547.80 |
gpuarrays/random (3) | 24.60 | 0.00 | 0.0 | 0.03 | 480.00 | 0.20 | 0.8 | 992.90 | 6644.76 |
gpuarrays/vectors (3) | 0.28 | 0.00 | 0.0 | 0.00 | 480.00 | 0.00 | 0.0 | 16.54 | 6644.76 |
gpuarrays/ext/jld2 (3) | 7.79 | 0.00 | 0.0 | 0.00 | 480.00 | 0.05 | 0.7 | 305.04 | 6676.76 |
gpuarrays/constructors (3) | 22.13 | 0.00 | 0.0 | 0.65 | 480.00 | 0.30 | 1.3 | 1147.61 | 6768.30 |
gpuarrays/reductions/mapreduce (3) | 27.65 | 0.01 | 0.0 | 1.83 | 484.00 | 0.45 | 1.6 | 2081.10 | 6937.31 |
gpuarrays/statistics (3) | 55.35 | 0.00 | 0.0 | 1.51 | 498.00 | 1.02 | 1.8 | 3624.64 | 7721.31 |
gpuarrays/linalg/norm (3) | 121.30 | 0.01 | 0.0 | 0.02 | 502.00 | 1.63 | 1.3 | 7316.93 | 10480.56 |
gpuarrays/linalg/NaN_false (3) | 14.49 | 0.00 | 0.0 | 0.00 | 502.00 | 0.12 | 0.8 | 789.30 | 10939.57 |
gpuarrays/math/intrinsics (3) | 1.91 | 0.00 | 0.0 | 0.00 | 502.00 | 0.00 | 0.0 | 89.65 | 10939.57 |
gpuarrays/linalg/mul!/matrix-matrix (3) | 81.99 | 0.01 | 0.0 | 0.13 | 506.00 | 1.39 | 1.7 | 5560.40 | 11332.05 |
gpuarrays/reductions/mapreducedim!_large (3) | 9.35 | 0.00 | 0.0 | 818.38 | 698.00 | 0.16 | 1.7 | 1975.14 | 12172.28 |
gpuarrays/uniformscaling (3) | 6.17 | 0.00 | 0.0 | 0.01 | 698.00 | 0.00 | 0.0 | 273.53 | 12172.28 |
gpuarrays/reductions/minimum maximum extrema (3) | 160.92 | 0.01 | 0.0 | 2.19 | 704.00 | 2.68 | 1.7 | 10382.93 | 13841.21 |
gpuarrays/reductions/any all count (3) | 7.01 | 0.00 | 0.0 | 0.00 | 704.00 | 0.09 | 1.2 | 542.95 | 13858.21 |
gpuarrays/indexing multidimensional (3) | 41.29 | 0.00 | 0.0 | 2.07 | 774.00 | 0.70 | 1.7 | 2589.85 | 14222.71 |
gpuarrays/indexing find (3) | 19.74 | 0.00 | 0.0 | 0.13 | 774.00 | 0.46 | 2.3 | 1615.46 | 14508.00 |
gpuarrays/linalg/mul!/vector-matrix (3) | 49.20 | 0.00 | 0.0 | 0.02 | 776.00 | 0.90 | 1.8 | 3566.35 | 14734.57 |
gpuarrays/math/power (3) | 11.65 | 0.00 | 0.0 | 0.01 | 776.00 | 0.30 | 2.6 | 1338.47 | 14774.07 |
gpuarrays/linalg (3) | 145.87 | 0.45 | 0.3 | 5406.77 | 752.00 | 3.60 | 2.5 | 33870.12 | 20590.75 |
gpuarrays/reductions/reducedim! (3) | 0.65 | 0.00 | 0.2 | 1.03 | 752.00 | 0.00 | 0.0 | 21.84 | 20590.75 |
gpuarrays/indexing scalar (3) | 8.11 | 0.00 | 0.0 | 0.01 | 754.00 | 0.14 | 1.8 | 519.94 | 20590.75 |
gpuarrays/alloc cache (3) | 1.26 | 0.00 | 0.0 | 0.00 | 754.00 | 0.00 | 0.0 | 111.79 | 20590.75 |
libraries/cusparse (3) | 103.87 | 0.03 | 0.0 | 12.95 | 756.00 | 1.42 | 1.4 | 5932.16 | 20590.75 |
From worker 3: ERROR: a exception was thrown during kernel execution on thread (33, 1, 1) in block (2, 1, 1).
From worker 3: Stacktrace not available, run Julia on debug level 2 for more details (by passing -g2 to the executable).
From worker 3:
From worker 3: ERROR: a exception was thrown during kernel execution on thread (33, 1, 1) in block (2, 1, 1).
From worker 3: Stacktrace not available, run Julia on debug level 2 for more details (by passing -g2 to the executable).
From worker 3:
From worker 3: ERROR: a exception was thrown during kernel execution on thread (1, 1, 1) in block (2, 1, 1).
From worker 3: Stacktrace not available, run Julia on debug level 2 for more details (by passing -g2 to the executable).
From worker 3:
From worker 3: ERROR: a exception was thrown during kernel execution on thread (1, 1, 1) in block (2, 1, 1).
From worker 3: Stacktrace not available, run Julia on debug level 2 for more details (by passing -g2 to the executable).
From worker 3:
libraries/cusolver/dense (3) | failed at 2025-10-05T11:21:03.486
From worker 4: WARNING: Method definition var"#3962#kernel"(Any) in module Main at /home/qtt/.julia/packages/CUDA/Wfi8S/test/core/execution.jl:358 overwritten at /home/qtt/.julia/packages/CUDA/Wfi8S/test/core/execution.jl:366.
core/execution (4) | 43.10 | 0.01 | 0.0 | 0.02 | 656.00 | 1.10 | 2.6 | 3123.82 | 1527.50 |
base/array (4) | failed at 2025-10-05T11:23:12.522
core/cudadrv (5) | 17.04 | 0.00 | 0.0 | 0.00 | 460.00 | 0.21 | 1.3 | 1026.63 | 1478.82 |
libraries/cublas/level2 (6) | 56.45 | 0.01 | 0.0 | 1.35 | 538.00 | 1.60 | 2.8 | 4080.19 | 1703.36 |
libraries/cublas/extensions (6) | 47.60 | 0.01 | 0.0 | 36.49 | 540.00 | 1.53 | 3.2 | 3802.86 | 2064.80 |
libraries/cusparse/generic (6) | 69.48 | 0.04 | 0.1 | 14.32 | 546.00 | 1.10 | 1.6 | 3926.32 | 2525.03 |
libraries/cublas/level3 (6) | 49.36 | 0.01 | 0.0 | 5.71 | 548.00 | 0.64 | 1.3 | 2426.10 | 2876.65 |
From worker 6: do_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/builtins.c:768
From worker 6: #118 at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:310
From worker 6: run_work_thunk at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:70
From worker 6: #117 at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:310
From worker 6: unknown function (ip: 0x7466a732a282)
From worker 6: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 6: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 6: jl_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
From worker 6: start_task at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/task.c:1256
From worker 6: error in running finalizer: CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc))
From worker 6: throw_api_error at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/libcuda.jl:30
From worker 6: check at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/libcuda.jl:37 [inlined]
From worker 6: cuModuleUnload at /home/qtt/.julia/packages/GPUToolbox/XaIIx/src/ccalls.jl:33 [inlined]
From worker 6: #989 at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/module.jl:92 [inlined]
From worker 6: #context!#1025 at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/state.jl:168 [inlined]
From worker 6: context! at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/state.jl:163 [inlined]
From worker 6: unsafe_unload! at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/module.jl:91
From worker 6: jfptr_unsafe_unloadNOT._15636 at /home/qtt/.julia/compiled/v1.10/CUDA/oWw5k_aU2nd.so (unknown line)
From worker 6: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 6: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 6: run_finalizer at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:318
From worker 6: jl_gc_run_finalizers_in_list at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:408
From worker 6: run_finalizers at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:454
From worker 6: ijl_atexit_hook at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/init.c:299
From worker 6: ijl_exit at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/init.c:207
From worker 6: exit at ./initdefs.jl:28 [inlined]
From worker 6: exit at ./initdefs.jl:29
From worker 6: jfptr_exit_79417.1 at /home/qtt/julia-1.10.10/lib/julia/sys.so (unknown line)
From worker 6: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 6: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 6: jl_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
From worker 6: jl_f__call_latest at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/builtins.c:812
From worker 6: #invokelatest#2 at ./essentials.jl:892 [inlined]
…
libraries/cusparse/interfaces (7) | 161.17 | 0.10 | 0.1 | 47.12 | 462.00 | 3.73 | 2.3 | 10386.80 | 1982.10 |
core/device/intrinsics/wmma (7) | 60.04 | 0.01 | 0.0 | 0.63 | 466.00 | 0.94 | 1.6 | 4089.82 | 2973.38 |
libraries/cufft (7) | 76.81 | 0.01 | 0.0 | 198.03 | 566.00 | 1.80 | 2.3 | 5875.49 | 3464.22 |
core/device/intrinsics/atomics (7) | 18.56 | 0.00 | 0.0 | 0.00 | 566.00 | 0.20 | 1.1 | 943.82 | 3845.84 |
…
From worker 7: context! at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/state.jl:163 [inlined]
From worker 7: unsafe_unload! at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/module.jl:91
From worker 7: jfptr_unsafe_unloadNOT._15636 at /home/qtt/.julia/compiled/v1.10/CUDA/oWw5k_aU2nd.so (unknown line)
From worker 7: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 7: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 7: run_finalizer at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:318
From worker 7: jl_gc_run_finalizers_in_list at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:408
From worker 7: run_finalizers at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gc.c:454
From worker 7: ijl_atexit_hook at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/init.c:299
From worker 7: ijl_exit at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/init.c:207
From worker 7: exit at ./initdefs.jl:28 [inlined]
From worker 7: exit at ./initdefs.jl:29
From worker 7: jfptr_exit_79417.1 at /home/qtt/julia-1.10.10/lib/julia/sys.so (unknown line)
From worker 7: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 7: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 7: jl_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
From worker 7: jl_f__call_latest at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/builtins.c:812
From worker 7: #invokelatest#2 at ./essentials.jl:892 [inlined]
From worker 7: invokelatest at ./essentials.jl:889
From worker 7: unknown function (ip: 0x7a85fdd2c265)
From worker 7: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 7: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 7: jl_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
From worker 7: do_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/builtins.c:768
From worker 7: #118 at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:310
From worker 7: run_work_thunk at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:70
From worker 7: #117 at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/usr/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:310
From worker 7: unknown function (ip: 0x7a85fdd28282)
From worker 7: _jl_invoke at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:2895 [inlined]
From worker 7: ijl_apply_generic at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/gf.c:3077
From worker 7: jl_apply at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/julia.h:1982 [inlined]
From worker 7: start_task at /cache/build/builder-amdci5-7/julialang/julia-release-1-dot-10/src/task.c:1256
From worker 7: error in running finalizer: CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc))
From worker 7: throw_api_error at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/libcuda.jl:30
From worker 7: check at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/libcuda.jl:37 [inlined]
From worker 7: cuModuleUnload at /home/qtt/.julia/packages/GPUToolbox/XaIIx/src/ccalls.jl:33 [inlined]
From worker 7: #989 at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/module.jl:92 [inlined]
From worker 7: #context!#1025 at /home/qtt/.julia/packages/CUDA/Wfi8S/lib/cudadrv/state.jl:168 [inlined]
┌ Warning: Failed to gracefully kill worker 7, sending SIGTERM
└ @ Distributed ~/julia-1.10.10/share/julia/stdlib/v1.10/Distributed/src/managers.jl:745
From worker 7: exit at ./initdefs.jl:28 [inlined]
From worker 7: exit at ./initdefs.jl:29
From worker 7:
From worker 7: [1683147] signal (15): Terminated
From worker 7: in expression starting at none:0
┌ Warning: Worker 7 ignored SIGTERM, sending SIGKILL
└ @ Distributed ~/julia-1.10.10/share/julia/stdlib/v1.10/Distributed/src/managers.jl:750
ERROR: LoadError: TaskFailedException
nested task error: rmprocs: pids [7] not terminated after 30 seconds.
Stacktrace:
[1] _rmprocs(pids::Vector{Int64}, waitfor::Int64)
@ Distributed ~/julia-1.10.10/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1069
[2] rmprocs(pids::Int64; waitfor::Int64)
@ Distributed ~/julia-1.10.10/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1037
[3] rmprocs
@ ~/julia-1.10.10/share/julia/stdlib/v1.10/Distributed/src/cluster.jl:1028 [inlined]
[4] (::var"#recycle_worker#57")(p::Int64)
@ Main ~/.julia/packages/CUDA/Wfi8S/test/runtests.jl:353
[5] (::var"#52#59"{Dict{String, DateTime}, Task, var"#recycle_worker#57"})()
@ Main ~/.julia/packages/CUDA/Wfi8S/test/runtests.jl:395
Stacktrace:
[1] sync_end(c::Channel{Any})
@ Base ./task.jl:455
[2] macro expansion
@ task.jl:487 [inlined]
[3] top-level scope
@ ~/.julia/packages/CUDA/Wfi8S/test/runtests.jl:344
[4] include(fname::String)
@ Base.MainInclude ./client.jl:494
[5] top-level scope
@ none:6
in expression starting at /home/qtt/.julia/packages/CUDA/Wfi8S/test/runtests.jl:314
ERROR: Package CUDA errored during testing
I omitted much of the “From worker 6” and “From worker 7” output above because it was too long to include here.
This problem has been bothering me for many days. I would be very grateful if you could help me.
If there is anything wrong with the way I have asked this question, please point it out. This is my first time asking a question in this community. Thank you.