Error with Flux.update! with Metal gpu backend

Bill_Holmes · August 18, 2023, 7:29pm

I’m trying to get Flux up and running with Metal as the GPU backend. The documentation for this is great, but I keep running into an issue, which I’ve distilled into a very simple feed-forward network. The code below defines a small model and some fake data. Everything moves to the GPU just fine, I can evaluate the model on the data, and I can take the gradient using Flux. The issue arises on the last line: when I try to use Flux to update the model parameters with the gradient, I get a very long error. See the screenshots below for my Metal versioninfo along with the error info. Note that I freshly updated Julia and all packages, so everything should be current.

The error trace is much longer than what is shown below, but the message “unsupported unsupported use of double value” repeats throughout it. I’m not sure how Float64 values would have entered into things, but it must be something in Flux.update! that is causing the issue (Flux.train! has the same issue). In this simple example the gradients are fine; it is the Adam step that is the problem.

Any thoughts or suggestions?

using Flux, Metal

# Define simple model
model = Chain(Dense(100 => 10,tanh), Dense(10 => 2)) |> gpu;

# Setup simple input - output data
x_in = randn(100,1) |> gpu;

y_out = [1.0 , 0.0] |> gpu;

# Setup simple loss function
loss(m,x,y) = Flux.logitcrossentropy(m(x),y);

# Setup optimizer
opt_setup = Flux.setup(Adam(),model) |> gpu ;

grads = gradient(m -> loss(m, x_in, y_out), model);

typeof(grads[1][1][1].weight); # Just checking that the grads output is still on gpu. It is.

Flux.update!(opt_setup, model, grads[1])

mcabbott · August 19, 2023, 2:50pm

This is the issue “Error in `update!` for Metal arrays and Adam optimiser” (Issue #150, FluxML/Optimisers.jl on GitHub). I think that using Flux.setup(Optimisers.Adam(), model) will avoid it.
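For reference, here is a minimal sketch of the original example with that one change applied. It assumes an Apple silicon machine with Flux already configured to use the Metal backend, as in the question, and that Optimisers.jl is installed in the active environment so that `import Optimisers` works (writing `Flux.Optimisers.Adam()` should be an alternative if it is not). The explicit Float32 data is my own addition rather than part of the suggested fix, and whether the workaround is still needed will depend on the installed package versions.

```julia
using Flux, Metal
import Optimisers  # assumption: Optimisers.jl is available in the active environment

# Same model as in the question
model = Chain(Dense(100 => 10, tanh), Dense(10 => 2)) |> gpu

# Fake data, created as Float32 up front (my addition: Metal has no Float64 support)
x_in = randn(Float32, 100, 1) |> gpu
y_out = Float32[1, 0] |> gpu

loss(m, x, y) = Flux.logitcrossentropy(m(x), y)

# The suggested change: pass Optimisers.Adam() instead of Flux's Adam()
opt_setup = Flux.setup(Optimisers.Adam(), model)

grads = gradient(m -> loss(m, x_in, y_out), model)

# This is the call that failed in the original example
Flux.update!(opt_setup, model, grads[1])
```

Because the optimiser state is set up from a model that is already on the GPU, this sketch drops the extra `|> gpu` on the result of `Flux.setup`.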
