I tried this in a fresh Julia session, and it ran without errors:
julia> ENV["JULIA_DEBUG"] = "CUDA"
"CUDA"
julia> using CUDA
julia> CUDA.version()
┌ Debug: Could not unload the system CUDA library; this will prevent use of the forward-compatible package
└ @ CUDA C:\Users\sairu\.julia\packages\CUDA\tTK8Y\lib\cudadrv\CUDAdrv.jl:69
v"11.4.0"
julia> CUDA.versioninfo()
┌ Debug: Trying to use artifacts...
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\bindeps.jl:164
┌ Debug: Selecting artifacts based on driver compatibility 11.4.0
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\bindeps.jl:178
Downloaded artifact: CUDA
┌ Debug: Using CUDA 11.7.0 from an artifact at C:\Users\sairu\.julia\artifacts\fd3b38cf5ade69a121c1ed6bc7a0a47f930ac0a1
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\bindeps.jl:205
CUDA toolkit 11.7, artifact installation
NVIDIA driver 472.19.0, for CUDA 11.4
CUDA driver 11.4
Libraries:
- CUBLAS: 11.10.1
- CURAND: 10.2.10
- CUFFT: 10.7.1
- CUSOLVER: 11.3.5
- CUSPARSE: 11.7.3
- CUPTI: 17.0.0
- NVML: 11.0.0+472.19
Downloaded artifact: CUDNN
┌ Debug: Using CUDNN from an artifact at C:\Users\sairu\.julia\artifacts\1a54a5d914c297394fc55b6aa7e4e68ff283303e
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\bindeps.jl:576
- CUDNN: 8.30.2 (for CUDA 11.5.0)
┌ Debug: CuDNN (v8302) function cudnnGetVersion() called:
│ Time: 2022-06-27T19:41:03.405872 (0d+0h+0m+1s since start)
│ Process=1768; Thread=23056; GPU=NULL; Handle=NULL; StreamId=NULL.
└ @ CUDA.CUDNN C:\Users\sairu\.julia\packages\CUDA\tTK8Y\lib\cudnn\CUDNN.jl:136
Downloaded artifact: CUTENSOR
┌ Debug: Using CUTENSOR library cutensor from an artifact at C:\Users\sairu\.julia\artifacts\b82bb42c0e83d6eff5dca5be2acac39a1f088b91
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\bindeps.jl:643
- CUTENSOR: 1.4.0 (for CUDA 11.5.0)
Toolchain:
- Julia: 1.7.2
- LLVM: 12.0.1
┌ Debug: Toolchain with LLVM 12.0.1, CUDA driver 11.4 and toolkit 11.7 supports devices 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2, 7.5 and 8.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5 and 7.0
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\compatibility.jl:222
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0
┌ Debug: Toolchain with LLVM 12.0.1, CUDA driver 11.4 and toolkit 11.7 supports devices 3.5, 3.7, 5.0, 5.2, 5.3, 6.0, 6.1, 6.2, 7.0, 7.2, 7.5 and 8.0; PTX 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5 and 7.0
└ @ CUDA.Deps C:\Users\sairu\.julia\packages\CUDA\tTK8Y\deps\compatibility.jl:222
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80
1 device:
0: NVIDIA GeForce RTX 3060 Laptop GPU (sm_86, 4.628 GiB / 6.000 GiB available)
julia> [CUDA.capability(dev) for dev in CUDA.devices()]
1-element Vector{VersionNumber}:
v"8.6.0"
julia> cu(rand(5,5))*cu(rand(5))
5-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
0.43582758
0.67973685
0.99499965
0.8340414
0.49534583