MethodError in loss function when using Flux with GPU support

I’m building a simple neural network for a regression problem, and I’m getting this error:

MethodError: no method matching loss(::Chain{Tuple{Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, typeof(identity)}}, ::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})

Closest candidates are:
loss(::Any, ::Any)
@ Main In[42]:14

Stacktrace:
[1] macro expansion
@ C:\Users\sofia\.julia\packages\Zygote\gsq4u\src\compiler\interface2.jl:101 [inlined]
[2] _pullback(::Zygote.Context{false}, ::typeof(loss), ::Chain{Tuple{Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, typeof(identity)}}, ::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})
@ Zygote C:\Users\sofia\.julia\packages\Zygote\gsq4u\src\compiler\interface2.jl:101
[3] _pullback
@ .\In[42]:22 [inlined]
[4] _pullback(ctx::Zygote.Context{false}, f::var"#22#23"{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}}, args::Chain{Tuple{Dense{typeof(relu), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, Dense{typeof(identity), CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, typeof(identity)}})

It ran successfully on the CPU, but the problem appeared when I moved my custom NN to the GPU.

function regression_nn(input_size, hidden_size)
    return Chain(
        Dense(input_size, hidden_size, relu),  # Hidden layer with ReLU activation
        Dense(hidden_size, 1),              # Output layer (1 neuron for regression)
        identity
    ) |> gpu
end
# Set the input and hidden layer sizes
input_size = size(Xtrain, 1)
hidden_size = 8

# Create the model
model = regression_nn(input_size, hidden_size)

# Define the RMSE loss function
function rmse(ŷ, y)
    return sqrt(Flux.Losses.mse(ŷ, y))  # RMSE = square root of the mean squared error
end

# Define the loss function for training
loss(x, y) = rmse(model(x), y)

# Choose an optimizer (e.g., stochastic gradient descent)
opt = Descent(0.001)
opt_state = Flux.setup(opt, model)
gpu_train_loader = Flux.DataLoader((Xtrain |> gpu, Ytrain |> gpu), batchsize=32, shuffle=true)
epochs = 1000
for epoch in 1:epochs
    for (x, y) in gpu_train_loader
        grads = gradient(m -> loss(m, x, y), model)
        Flux.update!(opt_state, model, grads[1])
    end
end
julia> (Xtrain, Xtest, Ytrain, Ytest) .|> typeof
(Matrix{Float32}, Matrix{Float32}, Vector{Float32}, Vector{Float32})

Thanks for your help.

Since size(model(x)) == (1, batchsize), the target should be reshaped into a 2-d array with a single row.
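For example (a minimal sketch; `Ytrain` is the `Vector{Float32}` from your post):

```julia
# model(x) returns a 1×batchsize matrix, so give the targets the same layout:
Ytrain_row = reshape(Ytrain, 1, :)   # Vector of length N → 1×N Matrix
```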

Here you call the function loss with 3 arguments:

    grads = gradient(m -> loss(m, x, y), model)

But here you define it with only 2 (with the model being a reference to the global variable):

    loss(x, y) = rmse(model(x), y)

I think you want loss(m, x, y) = rmse(m(x), y), which takes the model explicitly.
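Putting it together, a corrected training loop could look like the sketch below. It assumes the same `model`, `Xtrain`, and `Ytrain` as in your post, and uses `Flux.setup` (the explicit-gradient optimiser API from recent Flux versions) to build the `opt_state` that `Flux.update!` expects:

```julia
using Flux, CUDA

# Loss that takes the model explicitly, so Zygote differentiates w.r.t. it
loss(m, x, y) = sqrt(Flux.Losses.mse(m(x), y))

# Build optimiser state for the model's parameters
opt_state = Flux.setup(Descent(0.001), model)

# Reshape the targets to (1, N) so they match size(model(x)) == (1, batchsize)
gpu_train_loader = Flux.DataLoader((Xtrain |> gpu, reshape(Ytrain, 1, :) |> gpu),
                                   batchsize=32, shuffle=true)

for epoch in 1:1000
    for (x, y) in gpu_train_loader
        grads = gradient(m -> loss(m, x, y), model)
        Flux.update!(opt_state, model, grads[1])
    end
end
```

Passing the model as an argument (instead of closing over the global) is what makes `gradient(m -> loss(m, x, y), model)` line up with the loss definition, which resolves the MethodError.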
