I am using a relatively high-spec GPU (a GTX 1080 Ti) but cannot get it to run at more than about 5% utilization (according to Task Manager). I believe the GPU is being used (rather than the CPU) because “GPU memory” goes up by about 1 GB when I run the full script below, and the datatype returned by the neural net is CuArrays.CuArray{Float32,2,Nothing}. Does anyone have an idea of how to change the code or the learning algorithm so that it makes more use of the GPU? Or is very low utilization generally to be expected (maybe my test case is too simple to benefit from a GPU)?
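Before the full script, this is the kind of quick check that convinced me the data and the model really live on the device (a minimal sketch; the module that owns CuArray, and hence the exact printed type, differs between CuArrays.jl and CUDA.jl versions):

using Flux, CUDA

x_check = rand(Float32, 10, 4) |> gpu   # small dummy input, just for the check
m_check = Dense(10, 1) |> gpu           # hypothetical one-layer model, just for the check
typeof(x_check)                         # a CuArray type when the data is on the GPU
typeof(m_check(x_check))                # also a CuArray, so the forward pass ran on the GPU

The full script: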
using Statistics
using Flux, CUDA
using Random
# Making dummy data
obs = 1_000_000
x = rand(Float64, 10, obs)
y = mean(x, dims=1) + sum(x, dims=1)
y[findall(x[4,:] .< 0.3)] .= 17 # Making it slightly harder.
x = x |> gpu   # gpu (via cu) moves the array to the GPU and converts Float64 to Float32
y = y |> gpu
opt = Descent()
# Define the model on the CPU, then move a copy to the GPU
m_cpu = Chain(Dense(10, 6),
              Dense(6, 5),
              Dense(5, 4),
              Dense(4, 3),
              Dense(3, 2),
              Dense(2, 1))
m_gpu = m_cpu |> gpu
m_gpu(x)   # warm-up forward pass; returns a CuArray
CUDA.allowscalar(false)   # error on slow scalar indexing instead of silently falling back to it
                          # (on the older CuArrays.jl stack this was CuArrays.allowscalar(false))
dataset_gpu = Flux.Data.DataLoader(x, y, batchsize=2^12, shuffle=true)   # x and y already live on the GPU, so no |> gpu needed here
loss_gpu(A, B) = Flux.mae(m_gpu(A), B)
println("Doing GPU training")
loss_gpu(x, y)   # loss before training
for i in 1:100
    Flux.train!(loss_gpu, params(m_gpu), dataset_gpu, opt)
end
loss_gpu(x, y)   # loss after training
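In case it helps with diagnosing this, below is the kind of single-batch timing I can run to see where the time goes (a minimal sketch; it assumes the CUDA.jl stack, where CUDA.@sync waits for the GPU kernels to finish before @time returns; the synchronization call differs on the older CuArrays.jl stack):

xb = x[:, 1:2^12]   # one batch of the size the DataLoader uses
yb = y[:, 1:2^12]

CUDA.@sync loss_gpu(xb, yb)                                              # warm-up / compilation
@time CUDA.@sync loss_gpu(xb, yb)                                        # forward pass on one batch
@time CUDA.@sync Flux.gradient(() -> loss_gpu(xb, yb), params(m_gpu))    # forward + backward on one batch

My thinking is that if most of that per-batch time is overhead rather than kernel time, low utilization would follow, since each batch is only 4096 samples pushed through layers that are at most 10 units wide.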
I think this question is related to this one, but there did not seem to be a conclusion there.