I’m taking my first steps running Julia 1.6.5 code on the GPU. For some reason, the GPU does not seem to be used at all. These are the steps:
First, my GPU passed the test recommended at https://cuda.juliagpu.org/stable/:
# install the package
using Pkg
Pkg.add("CUDA")
# smoke test (this will download the CUDA toolkit)
using CUDA
CUDA.versioninfo()
using Pkg
Pkg.test("CUDA") # takes ~40 minutes if using 1 thread
Second, the code below took around 8 minutes (real time), supposedly running on my GPU. Ten times in a loop, it generates and multiplies two 10000 x 10000 matrices:
using CUDA
using Random
N = 10000
a_d = CuArray{Float32}(undef, (N, N))
b_d = CuArray{Float32}(undef, (N, N))
c_d = CuArray{Float32}(undef, (N, N))
for i in 1:10
global a_d = randn(N, N)
global b_d = randn(N, N)
global c_d = a_d * b_d
end
global a_d = nothing
global b_d = nothing
global c_d = nothing
GC.gc()
Terminal output:
(base) ciro@ciro-G3-3500:~/projects/julia/cuda$ time julia cuda-gpu.jl
real 8m13,016s
user 50m39,146s
sys 13m16,766s
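Since `time julia script.jl` also counts Julia startup and compilation, the multiplication itself can be timed in-process. Here is a minimal sketch of what I mean (assuming a working CUDA.jl install; as I understand it, `CUDA.@time` synchronizes with the device before reporting, and `CUDA.randn` generates the matrix directly on the device):

```julia
using CUDA

N = 10000
# Generate test data directly on the GPU as Float32
a_d = CUDA.randn(Float32, N, N)
b_d = CUDA.randn(Float32, N, N)

# The first call includes compilation overhead; time the second one
CUDA.@time a_d * b_d
CUDA.@time a_d * b_d
```

This should isolate the kernel time from the script's startup cost.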
Then I ran equivalent code on the CPU. The execution time is essentially the same:
using Random
N = 10000
for i in 1:10
a = randn(N, N)
b = randn(N, N)
c = a * b
end
Execution:
(base) ciro@ciro-G3-3500:~/projects/julia/cuda$ time julia cuda-cpu.jl
real 8m2,689s
user 50m9,567s
sys 13m3,738s
Moreover, watching the nvtop screen during the GPU run, it is weird to see the GPU memory and cores being loaded and unloaded accordingly, while the process still uses about 800% CPU (eight cores), the same usage the CPU version shows.
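For reference, ~800% CPU is what a multithreaded BLAS doing the CPU-side multiplies would use. The BLAS thread count can be queried in-process (a sketch; `BLAS.get_num_threads` comes from the standard LinearAlgebra library and, as far as I know, is available in 1.6):

```julia
using LinearAlgebra

# Number of threads OpenBLAS uses for CPU matrix multiplication
println(BLAS.get_num_threads())

# Optionally pin it, to make `time` comparisons reproducible:
# BLAS.set_num_threads(8)
```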
Any hint is greatly appreciated. Thanks.