So increasing the limit for memory mapping fixed the problem I had before, but now the tests fail with:
(@v1.10) pkg> test CUDA
Testing CUDA
Status `/tmp/jl_hViKlC/Project.toml`
[621f4979] AbstractFFTs v1.5.0
[79e6a3ab] Adapt v4.4.0
[ab4f0b2a] BFloat16s v0.6.0
[052768ef] CUDA v5.9.5
[d360d2e6] ChainRulesCore v1.26.0
[864edb3b] DataStructures v0.19.3
[7a1cc6ca] FFTW v1.10.0
[0c68f7d7] GPUArrays v11.3.1
[61eb1bfa] GPUCompiler v1.7.5
⌃ [a98d9a8b] Interpolations v0.15.1
[033835bb] JLD2 v0.6.3
[63c18a36] KernelAbstractions v0.9.39
[5da4648a] NVTX v1.0.1
[a0a7dd2c] SparseMatricesCSR v0.6.9
[276daf66] SpecialFunctions v2.6.1
[90137ffa] StaticArrays v1.9.15
[4ee394cb] CUDA_Driver_jll v13.0.2+0
[76a88914] CUDA_Runtime_jll v0.19.2+0
[ade2ca70] Dates
[8ba89e20] Distributed
[b77e0a4c] InteractiveUtils
[37e2e46d] LinearAlgebra
[44cfe95a] Pkg v1.10.0
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[2f01184e] SparseArrays v1.10.0
[10745b16] Statistics v1.10.0
[8dfed614] Test
Status `/tmp/jl_hViKlC/Manifest.toml`
[621f4979] AbstractFFTs v1.5.0
[79e6a3ab] Adapt v4.4.0
[a9b6321e] Atomix v1.1.2
[13072b0f] AxisAlgorithms v1.1.0
[ab4f0b2a] BFloat16s v0.6.0
[fa961155] CEnum v0.5.0
[052768ef] CUDA v5.9.5
[1af6417a] CUDA_Runtime_Discovery v1.0.0
[d360d2e6] ChainRulesCore v1.26.0
[0b6fb165] ChunkCodecCore v1.0.0
[4c0bbee4] ChunkCodecLibZlib v1.0.0
[55437552] ChunkCodecLibZstd v1.0.0
[3da002f7] ColorTypes v0.12.1
[5ae59095] Colors v0.13.1
[34da2185] Compat v4.18.1
[a8cc5b0e] Crayons v4.1.1
[9a962f9c] DataAPI v1.16.0
[a93c6f00] DataFrames v1.8.1
[864edb3b] DataStructures v0.19.3
[e2d170a0] DataValueInterfaces v1.0.0
[ffbed154] DocStringExtensions v0.9.5
[e2ba6199] ExprTools v0.1.10
[7a1cc6ca] FFTW v1.10.0
[5789e2e9] FileIO v1.17.1
[53c48c17] FixedPointNumbers v0.8.5
[0c68f7d7] GPUArrays v11.3.1
[46192b85] GPUArraysCore v0.2.0
[61eb1bfa] GPUCompiler v1.7.5
[096a3bc2] GPUToolbox v1.0.0
[076d061b] HashArrayMappedTries v0.2.0
[842dd82b] InlineStrings v1.4.5
⌃ [a98d9a8b] Interpolations v0.15.1
[41ab1584] InvertedIndices v1.3.1
[92d709cd] IrrationalConstants v0.2.6
[82899510] IteratorInterfaceExtensions v1.0.0
[033835bb] JLD2 v0.6.3
[692b3bcd] JLLWrappers v1.7.1
[63c18a36] KernelAbstractions v0.9.39
[929cbde3] LLVM v9.4.4
[8b046642] LLVMLoopInfo v1.0.0
[b964fa9f] LaTeXStrings v1.4.0
[2ab3a3ac] LogExpFunctions v0.3.29
[1914dd2f] MacroTools v0.5.16
[e1d29d7a] Missings v1.2.0
[5da4648a] NVTX v1.0.1
[6fe1bfb0] OffsetArrays v1.17.0
[bac558e1] OrderedCollections v1.8.1
[2dfb63ee] PooledArrays v1.4.3
⌅ [aea7be01] PrecompileTools v1.2.1
[21216c6a] Preferences v1.5.0
[08abe8d2] PrettyTables v3.1.2
[74087812] Random123 v1.7.1
[e6cf234a] RandomNumbers v1.6.0
[c84ed2f1] Ratios v0.4.5
[189a3867] Reexport v1.2.2
[ae029012] Requires v1.3.1
[7e506255] ScopedValues v1.5.0
[6c6a2e73] Scratch v1.3.0
[91c51154] SentinelArrays v1.4.8
[a2af1166] SortingAlgorithms v1.2.2
[a0a7dd2c] SparseMatricesCSR v0.6.9
[276daf66] SpecialFunctions v2.6.1
[90137ffa] StaticArrays v1.9.15
[1e83bf80] StaticArraysCore v1.4.4
[892a3eda] StringManipulation v0.4.2
[3783bdb8] TableTraits v1.0.1
[bd369af6] Tables v1.12.1
[e689c965] Tracy v0.1.6
[013be700] UnsafeAtomics v0.3.0
[efce3f68] WoodburyMatrices v1.0.0
[d1e2174e] CUDA_Compiler_jll v0.3.0+0
[4ee394cb] CUDA_Driver_jll v13.0.2+0
[76a88914] CUDA_Runtime_jll v0.19.2+0
[f5851436] FFTW_jll v3.3.11+0
[1d5cc7b8] IntelOpenMP_jll v2025.2.0+0
[9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
[dad2f222] LLVMExtra_jll v0.0.38+0
[ad6e5548] LibTracyClient_jll v0.9.1+6
[856f044c] MKL_jll v2025.2.0+0
[e98f9f5b] NVTX_jll v3.2.2+0
[efe28fd5] OpenSpecFun_jll v0.5.6+0
[3161d3a3] Zstd_jll v1.5.7+1
[1e29f10c] demumble_jll v1.3.0+0
[1317d2d5] oneTBB_jll v2022.0.0+1
[0dad84c5] ArgTools v1.1.1
[56f22d72] Artifacts
[2a0f44e3] Base64
[ade2ca70] Dates
[8ba89e20] Distributed
[f43a241f] Downloads v1.6.0
[7b1f6079] FileWatching
[9fa8497b] Future
[b77e0a4c] InteractiveUtils
[4af54fe1] LazyArtifacts
[b27032c2] LibCURL v0.6.4
[76f85450] LibGit2
[8f399da3] Libdl
[37e2e46d] LinearAlgebra
[56ddb016] Logging
[d6f4376e] Markdown
[a63ad114] Mmap
[ca575930] NetworkOptions v1.2.0
[44cfe95a] Pkg v1.10.0
[de0858da] Printf
[3fa0cd96] REPL
[9a3f8284] Random
[ea8e919c] SHA v0.7.0
[9e88b42a] Serialization
[1a1011a3] SharedArrays
[6462fe0b] Sockets
[2f01184e] SparseArrays v1.10.0
[10745b16] Statistics v1.10.0
[4607b0f0] SuiteSparse
[fa267f1f] TOML v1.0.3
[a4e569a6] Tar v1.10.0
[8dfed614] Test
[cf7118a7] UUIDs
[4ec0a83e] Unicode
[e66e0078] CompilerSupportLibraries_jll v1.1.1+0
[deac9b47] LibCURL_jll v8.4.0+0
[e37daf67] LibGit2_jll v1.6.4+0
[29816b5a] LibSSH2_jll v1.11.0+1
[c8ffd9c3] MbedTLS_jll v2.28.2+1
[14a3606d] MozillaCACerts_jll v2023.1.10
[4536629a] OpenBLAS_jll v0.3.23+4
[05823500] OpenLibm_jll v0.8.5+0
[bea87d4a] SuiteSparse_jll v7.2.1+1
[83775a58] Zlib_jll v1.2.13+1
[8e850b90] libblastrampoline_jll v5.11.0+0
[8e850ede] nghttp2_jll v1.52.0+1
[3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading.
Testing Running tests...
┌ Info: System information:
│ CUDA toolchain:
│ - runtime 12.6, local installation
│ - driver 565.57.1 for 13.0
│ - compiler 12.9
│
│ CUDA libraries:
│ - CUBLAS: 12.6.3
│ - CURAND: 10.3.7
│ - CUFFT: 11.3.0
│ - CUSOLVER: 11.7.1
│ - CUSPARSE: 12.5.4
│ - CUPTI: 2024.3.2 (API 12.6.0)
│ - NVML: 12.0.0+565.57.1
│
│ Julia packages:
│ - CUDA: 5.9.5
│ - CUDA_Driver_jll: 13.0.2+0
│ - CUDA_Compiler_jll: 0.3.0+0
│ - CUDA_Runtime_jll: 0.19.2+0
│ - CUDA_Runtime_Discovery: 1.0.0
│
│ Toolchain:
│ - Julia: 1.10.10
│ - LLVM: 15.0.7
│
│ Environment:
│ - JULIA_CUDA_MEMORY_POOL: none
│ - JULIA_CUDA_USE_BINARYBUILDER: false
│
│ Preferences:
│ - CUDA_Runtime_jll.version: 12.6
│ - CUDA_Runtime_jll.local: true
│
│ 1 device:
└ 0: NVIDIA GH200 120GB (sm_90, 94.997 GiB / 95.577 GiB available)
[ Info: Testing using device 0 (NVIDIA GH200 120GB). To change this, specify the `--gpu` argument to the tests, or set the `CUDA_VISIBLE_DEVICES` environment variable.
[ Info: Running 47 tests in parallel. If this is too many, specify the `--jobs` argument to the tests, or set the `JULIA_CPU_THREADS` environment variable.
┌ Warning: Running tests on a GPU in exclusive mode; reducing parallelism to 1.
└ @ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/runtests.jl:181
| | ---------------- GPU ---------------- | ---------------- CPU ---------------- |
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB) | GC (s) | GC % | Alloc (MB) | RSS (MB) |
core/initialization (2) | 3.56 | 0.00 | 0.0 | 0.00 | 558.00 | 0.01 | 0.2 | 61.43 | 1121.62 |
gpuarrays/reductions/sum prod (3) | 108.53 | 0.03 | 0.0 | 3.24 | 630.00 | 3.32 | 3.1 | 11212.63 | 3911.00 |
gpuarrays/reductions/reduce (3) | 63.61 | 0.02 | 0.0 | 1.53 | 634.00 | 1.69 | 2.7 | 9181.62 | 4919.00 |
gpuarrays/reductions/mapreducedim! (3) | 42.20 | 0.01 | 0.0 | 1.54 | 636.00 | 0.80 | 1.9 | 4306.77 | 5675.00 |
gpuarrays/broadcasting (3) | 103.48 | 0.02 | 0.0 | 2.00 | 642.00 | 1.91 | 1.9 | 10033.50 | 8015.00 |
gpuarrays/reductions/== isequal (3) | 37.30 | 0.01 | 0.0 | 1.07 | 646.00 | 0.96 | 2.6 | 5579.53 | 8627.00 |
gpuarrays/base (3) | 16.83 | 0.00 | 0.0 | 8.90 | 646.00 | 0.60 | 3.6 | 2604.10 | 9059.00 |
gpuarrays/random (3) | 9.34 | 0.02 | 0.2 | 392.05 | 766.00 | 0.14 | 1.5 | 1508.73 | 9599.00 |
gpuarrays/vectors (3) | 0.20 | 0.00 | 0.2 | 0.00 | 648.00 | 0.00 | 0.0 | 18.08 | 9599.00 |
gpuarrays/ext/jld2 (3) | 5.54 | 0.00 | 0.0 | 0.00 | 648.00 | 0.04 | 0.7 | 325.20 | 9707.00 |
gpuarrays/constructors (3) | 14.37 | 0.01 | 0.0 | 0.65 | 648.00 | 0.18 | 1.3 | 1166.46 | 9851.00 |
gpuarrays/reductions/mapreduce (3) | 19.01 | 0.01 | 0.1 | 1.83 | 652.00 | 0.33 | 1.7 | 2205.48 | 9995.00 |
gpuarrays/statistics (3) | 37.38 | 0.01 | 0.0 | 1.51 | 718.00 | 0.65 | 1.7 | 3696.44 | 11039.00 |
gpuarrays/linalg/norm (3) | 82.73 | 0.02 | 0.0 | 0.02 | 722.00 | 1.29 | 1.6 | 7597.49 | 14099.00 |
gpuarrays/linalg/NaN_false (3) | 9.72 | 0.00 | 0.0 | 0.00 | 724.00 | 0.05 | 0.5 | 800.11 | 14675.00 |
gpuarrays/math/intrinsics (3) | 1.14 | 0.00 | 0.0 | 0.00 | 724.00 | 0.00 | 0.0 | 91.07 | 14675.00 |
gpuarrays/linalg/mul!/matrix-matrix (3) | 55.64 | 0.02 | 0.0 | 0.13 | 726.00 | 0.79 | 1.4 | 5628.03 | 15467.00 |
gpuarrays/sparse (3) | 0.00 | 0.00 | 0.0 | 0.00 | 726.00 | 0.00 | 0.0 | 0.15 | 15467.00 |
gpuarrays/reductions/mapreducedim!_large (3) | 5.90 | 0.02 | 0.3 | 818.38 | 766.00 | 0.10 | 1.6 | 1984.85 | 16301.81 |
gpuarrays/uniformscaling (3) | 4.14 | 0.00 | 0.0 | 0.01 | 726.00 | 0.00 | 0.0 | 275.87 | 16301.81 |
gpuarrays/reductions/minimum maximum extrema (3) | 106.89 | 0.02 | 0.0 | 2.19 | 732.00 | 1.89 | 1.8 | 10842.66 | 18428.56 |
gpuarrays/reductions/any all count (3) | 5.19 | 0.00 | 0.0 | 0.00 | 734.00 | 0.06 | 1.1 | 571.55 | 18500.56 |
gpuarrays/indexing multidimensional (3) | 29.06 | 0.00 | 0.0 | 2.07 | 822.00 | 0.45 | 1.6 | 2615.50 | 19040.56 |
gpuarrays/indexing find (3) | 13.49 | 0.00 | 0.0 | 0.13 | 822.00 | 0.36 | 2.6 | 1651.20 | 19364.56 |
gpuarrays/linalg/mul!/vector-matrix (3) | 33.45 | 0.01 | 0.0 | 0.02 | 822.00 | 0.62 | 1.8 | 3597.05 | 19832.56 |
gpuarrays/math/power (3) | 8.32 | 0.00 | 0.0 | 0.01 | 822.00 | 0.22 | 2.7 | 1355.35 | 19868.56 |
gpuarrays/linalg/core (3) | 104.86 | 0.27 | 0.3 | 5409.05 | 970.00 | 2.46 | 2.3 | 34659.54 | 26517.25 |
gpuarrays/reductions/reducedim! (3) | 0.48 | 0.00 | 0.5 | 1.03 | 832.00 | 0.00 | 0.0 | 21.95 | 26517.25 |
gpuarrays/indexing scalar (3) | 5.54 | 0.00 | 0.0 | 0.01 | 832.00 | 0.07 | 1.3 | 522.67 | 26517.25 |
gpuarrays/alloc cache (3) | 0.83 | 0.00 | 0.0 | 0.00 | 832.00 | 0.00 | 0.0 | 111.58 | 26517.25 |
libraries/cusparse (3) | 70.95 | 0.12 | 0.2 | 23.36 | 844.00 | 1.04 | 1.5 | 5825.04 | 26517.25 |
libraries/cusolver/dense (3) | 121.86 | 0.23 | 0.2 | 280.34 | 1220.00 | 2.04 | 1.7 | 11838.60 | 26517.25 |
base/array (3) | 35.09 | 0.02 | 0.1 | 1316.20 | 2282.00 | 0.66 | 1.9 | 4712.71 | 27821.00 |
From worker 3: WARNING: Method definition var"#10662#kernel"(Any) in module Main at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/core/execution.jl:358 overwritten at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/core/execution.jl:366.
core/execution (3) | 19.61 | 0.00 | 0.0 | 0.02 | 1162.00 | 0.25 | 1.3 | 1839.64 | 27821.00 |
libraries/cublas/extensions (3) | 21.95 | 0.06 | 0.3 | 36.69 | 1166.00 | 0.39 | 1.8 | 2112.08 | 27821.00 |
core/cudadrv (3) | failed at 2025-12-09T09:29:03.454
libraries/cublas/level2 (4) | 38.29 | 0.01 | 0.0 | 1.35 | 774.00 | 1.34 | 3.5 | 4123.64 | 2130.88 |
libraries/cublas/level3/gemm (4) | 56.03 | 0.03 | 0.0 | 8.95 | 784.00 | 1.58 | 2.8 | 6013.99 | 3030.88 |
libraries/cublas/level3 (4) | 33.65 | 0.02 | 0.1 | 5.74 | 784.00 | 0.53 | 1.6 | 2495.24 | 3642.88 |
libraries/cusparse/generic (4) | 40.23 | 0.12 | 0.3 | 14.22 | 788.00 | 0.69 | 1.7 | 3458.64 | 4470.88 |
libraries/cublas/xt (4) | 7.39 | 0.00 | 0.0 | 0.98 | 856.00 | 0.07 | 0.9 | 468.19 | 4650.88 |
base/sorting (4) | failed at 2025-12-09T09:32:47.805
core/device/intrinsics/wmma (5) | 42.65 | 0.01 | 0.0 | 0.63 | 626.00 | 1.07 | 2.5 | 4834.52 | 2435.00 |
libraries/cusparse/interfaces (5) | 99.18 | 0.31 | 0.3 | 47.12 | 636.00 | 2.66 | 2.7 | 9399.03 | 4297.25 |
libraries/cufft (5) | 52.88 | 0.03 | 0.1 | 198.03 | 884.00 | 1.17 | 2.2 | 5886.47 | 5161.25 |
core/device/intrinsics/atomics (5) | 11.62 | 0.00 | 0.0 | 0.00 | 838.00 | 0.16 | 1.4 | 971.84 | 5652.12 |
libraries/cusparse/conversions (5) | 7.72 | 0.02 | 0.2 | 1.73 | 838.00 | 0.22 | 2.8 | 878.62 | 5760.12 |
libraries/cusolver/dense_generic (5) | 39.17 | 0.02 | 0.1 | 15.11 | 1228.00 | 0.98 | 2.5 | 4277.75 | 6840.12 |
base/texture (5) | 22.44 | 0.00 | 0.0 | 0.10 | 1224.00 | 0.63 | 2.8 | 3079.93 | 7236.12 |
core/device/intrinsics/cooperative_groups (5) | 27.14 | 0.01 | 0.0 | 20.50 | 1222.00 | 0.26 | 1.0 | 1799.05 | 9108.12 |
core/device/intrinsics (5) | failed at 2025-12-09T09:38:13.958
libraries/cublas/level1 (6) | 31.71 | 0.01 | 0.0 | 0.03 | 688.00 | 1.33 | 4.2 | 3672.91 | 1770.69 |
libraries/cusparse/bmm (6) | 25.32 | 0.02 | 0.1 | 0.99 | 778.00 | 1.29 | 5.1 | 3873.55 | 2454.69 |
core/device/array (6) | 3.48 | 0.00 | 0.0 | 0.00 | 778.00 | 0.08 | 2.2 | 391.56 | 2490.69 |
base/random (6) | 20.09 | 0.01 | 0.0 | 4352.59 | 778.00 | 0.39 | 1.9 | 1838.40 | 3282.69 |
libraries/cusolver/sparse (6) | 14.40 | 0.00 | 0.0 | 0.22 | 844.00 | 0.33 | 2.3 | 1354.57 | 3390.69 |
core/device/intrinsics/memory (6) | 4.58 | 0.00 | 0.0 | 0.02 | 844.00 | 0.06 | 1.4 | 385.94 | 3534.69 |
core/codegen (6) | 2.19 | 0.00 | 0.0 | 0.00 | 844.00 | 0.03 | 1.4 | 156.67 | 3678.69 |
core/device/intrinsics/math (6) | 22.61 | 0.00 | 0.0 | 0.00 | 846.00 | 0.38 | 1.7 | 1921.37 | 4866.69 |
core/device/intrinsics/output (6) | 6.29 | 0.00 | 0.0 | 0.00 | 846.00 | 0.14 | 2.3 | 742.78 | 5010.69 |
core/device/random (6) | 23.24 | 0.01 | 0.0 | 0.37 | 850.00 | 0.30 | 1.3 | 1674.00 | 5946.69 |
libraries/cusparse/device (6) | 1.43 | 0.00 | 0.0 | 0.01 | 850.00 | 0.04 | 2.6 | 192.60 | 5982.69 |
libraries/cusolver/multigpu (6) | 10.35 | 0.03 | 0.3 | 545.60 | 1512.00 | 0.14 | 1.3 | 840.54 | 6522.69 |
core/device/ldg (6) | 4.50 | 0.00 | 0.0 | 0.00 | 858.00 | 0.12 | 2.6 | 548.39 | 6558.19 |
libraries/cusparse/broadcast (6) | 46.83 | 0.01 | 0.0 | 0.13 | 860.00 | 1.08 | 2.3 | 5251.58 | 7530.19 |
libraries/cusolver/base (6) | 0.10 | 0.00 | 0.0 | 0.00 | 860.00 | 0.00 | 0.0 | 1.86 | 7530.19 |
core/pointer (6) | 0.25 | 0.00 | 0.0 | 0.00 | 860.00 | 0.00 | 0.0 | 7.63 | 7530.19 |
base/broadcast (6) | 9.84 | 0.00 | 0.0 | 0.00 | 862.00 | 0.16 | 1.6 | 945.92 | 7890.19 |
core/nvml (6) | 0.57 | 0.00 | 0.0 | 0.00 | 862.00 | 0.00 | 0.0 | 53.32 | 7890.19 |
libraries/cusparse/linalg (6) | 44.54 | 0.10 | 0.2 | 6.78 | 864.00 | 1.22 | 2.7 | 5339.69 | 8934.19 |
base/exceptions (6) | failed at 2025-12-09T09:49:21.644
libraries/cusolver/sparse_factorizations (7) | 22.63 | 0.01 | 0.0 | 18.32 | 772.00 | 1.43 | 6.3 | 3396.66 | 2057.56 |
core/profile (7) | 276.86 | 0.00 | 0.0 | 0.00 | 766.00 | 10.35 | 3.7 | 81968.94 | 2993.56 |
base/iterator (7) | 2.64 | 0.00 | 0.0 | 1.93 | 766.00 | 0.08 | 2.9 | 392.02 | 2993.56 |
base/threading (7) | 3.05 | 0.01 | 0.2 | 10.94 | 832.00 | 0.10 | 3.3 | 357.20 | 2993.56 |
core/utils (7) | 0.61 | 0.00 | 0.0 | 0.00 | 830.00 | 0.01 | 1.7 | 70.97 | 2993.56 |
core/pool (7) | 2.31 | 0.00 | 0.0 | 0.00 | 638.00 | 0.67 | 28.9 | 244.97 | 2993.56 |
libraries/cusparse/sparse_matrices_csr (7) | 3.58 | 0.00 | 0.1 | 1.48 | 638.00 | 0.10 | 2.9 | 370.75 | 2993.56 |
base/linalg (7) | 42.14 | 0.02 | 0.1 | 1554.64 | 704.00 | 2.68 | 6.4 | 15829.82 | 5038.81 |
libraries/cusparse/reduce (7) | 15.85 | 0.11 | 0.7 | 0.06 | 704.00 | 0.38 | 2.4 | 1874.93 | 5038.81 |
libraries/staticarrays (7) | 1.02 | 0.00 | 0.0 | 0.00 | 704.00 | 0.03 | 3.4 | 193.18 | 5038.81 |
base/kernelabstractions (7) | 30.98 | 0.01 | 0.0 | 71.03 | 820.00 | 1.20 | 3.9 | 3575.04 | 5038.81 |
base/examples (7) | 5.61 | 0.00 | 0.0 | 385.30 | 1204.00 | 0.67 | 12.0 | 1313.16 | 5195.44 |
libraries/curand (7) | 0.05 | 0.00 | 0.0 | 0.00 | 820.00 | 0.00 | 0.0 | 1.77 | 5195.44 |
Testing finished in 48 minutes, 39 seconds, 469 milliseconds
Worker 3 failed running test core/cudadrv:
Some tests did not pass: 2065 passed, 0 failed, 1 errored, 3 broken.
core/cudadrv: Error During Test at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/core/cudadrv.jl:132
Got exception outside of a @test
CUDA error: limit is not supported on this architecture (code 215, ERROR_UNSUPPORTED_LIMIT)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:30
[2] check
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:37 [inlined]
[3] cuCtxGetLimit
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/GPUToolbox/JLBB1/src/ccalls.jl:33 [inlined]
[4] limit(lim::CUDA.CUlimit_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/context.jl:351
[5] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/core/cudadrv.jl:134 [inlined]
[6] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[7] top-level scope
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/core/cudadrv.jl:134
[8] include
@ ./client.jl:494 [inlined]
[9] #12
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/runtests.jl:89 [inlined]
[10] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:70 [inlined]
[11] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[12] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:70 [inlined]
[13] macro expansion
@ ./timing.jl:503 [inlined]
[14] top-level scope
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:69
[15] eval
@ ./boot.jl:385 [inlined]
[16] (::var"#inner#3"{Serialization.__deserialized_types__.var"#12#17"{String}, String, Symbol})()
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:77
[17] runtests(f::Function, name::String, time_source::Symbol)
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:135
[18] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::@Kwargs{})
@ Base ./essentials.jl:892
[19] invokelatest(::Any, ::Any, ::Vararg{Any})
@ Base ./essentials.jl:889
[20] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:287
[21] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:70
[22] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:287
Worker 4 failed running test base/sorting:
Some tests did not pass: 143 passed, 0 failed, 21 errored, 0 broken.
base/sorting: Error During Test at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:256
Got exception outside of a @test
CUDA error: limit is not supported on this architecture (code 215, ERROR_UNSUPPORTED_LIMIT)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:30
[2] check
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:37 [inlined]
[3] cuCtxGetLimit
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/GPUToolbox/JLBB1/src/ccalls.jl:33 [inlined]
[4] limit(lim::CUDA.CUlimit_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/context.jl:351
[5] quicksort!(c::CuArray{UInt8, 1, CUDA.DeviceMemory}; lt::typeof(isless), by::typeof(identity), dims::Int64, partial_k::Nothing, block_size_shift::Int64)
@ CUDA.QuickSortImpl /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:477
[6] quicksort!
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:473 [inlined]
[7] (::var"#check#129"{var"#init#127"})(block_size_shift::Int64)
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:266
[8] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:273 [inlined]
[9] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[10] top-level scope
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:257
[11] include
@ ./client.jl:494 [inlined]
[12] #12
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/runtests.jl:89 [inlined]
[13] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:66 [inlined]
[14] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[15] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:66 [inlined]
[16] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/utilities.jl:35 [inlined]
[17] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/memory.jl:835 [inlined]
[18] top-level scope
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:65
[19] eval
@ ./boot.jl:385 [inlined]
[20] (::var"#inner#3"{Serialization.__deserialized_types__.var"#12#17"{String}, String, Symbol})()
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:77
[21] runtests(f::Function, name::String, time_source::Symbol)
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/setup.jl:135
[22] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::@Kwargs{})
@ Base ./essentials.jl:892
[23] invokelatest(::Any, ::Any, ::Vararg{Any})
@ Base ./essentials.jl:889
[24] (::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}})()
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:287
[25] run_work_thunk(thunk::Distributed.var"#110#112"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:70
[26] (::Distributed.var"#109#111"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
@ Distributed /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Distributed/src/process_messages.jl:287
base/sorting: Error During Test at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:283
Test threw exception
Expression: check_sort!(Int, 1000000; alg = CUDA.QuickSort)
CUDA error: limit is not supported on this architecture (code 215, ERROR_UNSUPPORTED_LIMIT)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:30
[2] check
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/libcuda.jl:37 [inlined]
[3] cuCtxGetLimit
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/GPUToolbox/JLBB1/src/ccalls.jl:33 [inlined]
[4] limit(lim::CUDA.CUlimit_enum)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/lib/cudadrv/context.jl:351
[5] quicksort!(c::CuArray{Int64, 1, CUDA.DeviceMemory}; lt::typeof(isless), by::typeof(identity), dims::Int64, partial_k::Nothing, block_size_shift::Int64)
@ CUDA.QuickSortImpl /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:477
[6] sort!(c::CuArray{Int64, 1, CUDA.DeviceMemory}, alg::CUDA.QuickSortAlg; lt::Function, by::Function, rev::Bool)
@ CUDA /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:991
[7] sort!
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:985 [inlined]
[8] #sort!#1296
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/src/sorting.jl:1000 [inlined]
[9] check_sort!(T::Type, N::Int64, f::Function; kwargs::@Kwargs{alg::CUDA.QuickSortAlg})
@ Main /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:198
[10] check_sort!
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:196 [inlined]
[11] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:283 [inlined]
[12] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:669 [inlined]
[13] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:283 [inlined]
[14] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[15] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:283 [inlined]
[16] macro expansion
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/juliaup/julia-1.10.10+0.aarch64.linux.gnu/share/julia/stdlib/v1.10/Test/src/Test.jl:1577 [inlined]
[17] top-level scope
@ /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:281
base/sorting: Error During Test at /cluster/projects/nn9874k/aklocker/juliaup/depot/packages/CUDA/x8d2s/test/base/sorting.jl:284
Is this related to the same problem?