I’m new to CuArrays and there’s something I’m not quite understanding. I have a working example that pipes the data and the model to the GPU and trains there. But when I try to call the loss function outside of the training loop as loss(X, Y), I get the error

ArgumentError: cannot take the CPU address of a CuArray{Float32,2,Nothing}

even though I’ve previously moved X and Y to the GPU (with X |> gpu; Y |> gpu).

I found that the call loss(X |> gpu, Y |> gpu) works… but why should I have to do that if I’ve already moved X and Y to the GPU earlier in the code? Isn’t this horribly inefficient? The data doesn’t change, so I should only have to move it to the GPU once. It’s likely I’m misunderstanding something here, so any help would be appreciated! Minimum working example below.
using Flux
using CuArrays
num_samples, Ny, n = 50, 3, 5
X = rand(1,num_samples)
Y = rand(Ny, num_samples)
data = [(X[:,i], Y[:,i]) for i in 1:size(X,2)] |> gpu
X |> gpu # I expected these two lines to move X and Y to the GPU
Y |> gpu
m = Chain(Dense(size(X,1),n,relu),Dense(n,n,tanh),Dense(n,size(Y,1))) |> gpu
loss(x, y) = Flux.mse(m(x), y)
ps = Flux.params(m)
for i = 1:3
    println("Epoch "*string(i))
    Flux.train!(loss, ps, data, ADAM()) #Later: , cb=()->@show loss(X, Y)
end
@show loss(X |> gpu, Y |> gpu) # works
@show loss(X, Y) # does not work, even though X |> gpu before training
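I suspect the types are a clue. Checking them right after the two pipe lines gives this (my own diagnostic, not part of the MWE, and my reading of it could be off):

@show typeof(X)        # Array{Float64,2} -- X still looks like a CPU array here
@show typeof(X |> gpu) # CuArray{Float32,2,Nothing} -- the pipe returns a new array

So it seems like X |> gpu returns a GPU copy rather than moving X itself, but I’m not sure if that’s the intended semantics.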
Running the MWE produces the output below, with the error coming from its final line, loss(X, Y).
Epoch 1
Epoch 2
Epoch 3
loss(X |> gpu, Y |> gpu) = 0.13247906f0
ArgumentError: cannot take the CPU address of a CuArray{Float32,2,Nothing}
Stacktrace:
[1] unsafe_convert(::Type{Ptr{Float32}}, ::CuArray{Float32,2,Nothing}) at C:\Users\username\.julia\packages\CuArrays\9n5uC\src\array.jl:226
[2] gemm!(::Char, ::Char, ::Float32, ::CuArray{Float32,2,Nothing}, ::Array{Float32,2}, ::Float32, ::Array{Float32,2}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\LinearAlgebra\src\blas.jl:1167
[3] gemm_wrapper!(::Array{Float32,2}, ::Char, ::Char, ::CuArray{Float32,2,Nothing}, ::Array{Float32,2}, ::LinearAlgebra.MulAddMul{true,true,Bool,Bool}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\LinearAlgebra\src\matmul.jl:597
[4] mul! at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\LinearAlgebra\src\matmul.jl:169 [inlined]
[5] mul! at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\LinearAlgebra\src\matmul.jl:208 [inlined]
[6] * at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\LinearAlgebra\src\matmul.jl:160 [inlined]
[7] (::Dense{typeof(relu),CuArray{Float32,2,Nothing},CuArray{Float32,1,Nothing}})(::Array{Float32,2}) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:122
[8] Dense at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:133 [inlined]
[9] (::Dense{typeof(relu),CuArray{Float32,2,Nothing},CuArray{Float32,1,Nothing}})(::Array{Float64,2}) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:136
[10] applychain at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:36 [inlined]
[11] (::Chain{Tuple{Dense{typeof(relu),CuArray{Float32,2,Nothing},CuArray{Float32,1,Nothing}},Dense{typeof(tanh),CuArray{Float32,2,Nothing},CuArray{Float32,1,Nothing}},Dense{typeof(identity),CuArray{Float32,2,Nothing},CuArray{Float32,1,Nothing}}}})(::Array{Float64,2}) at C:\Users\username\.julia\packages\Flux\Fj3bt\src\layers\basic.jl:38
[12] loss(::Array{Float64,2}, ::Array{Float64,2}) at .\In[7]:11
[13] top-level scope at show.jl:613
[14] top-level scope at In[7]:19
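If my reading above is right, then I assume the efficient pattern is to bind the GPU copies to names once, up front, instead of piping on every call. Something like the sketch below (untested; Xg and Yg are names I made up):

Xg = X |> gpu       # bind the returned CuArray to a name, once
Yg = Y |> gpu
@show loss(Xg, Yg)  # I’d expect this to behave like loss(X |> gpu, Y |> gpu)

Is that the recommended usage, or is there a way to move an existing array to the GPU in place?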