I’m trying to get Flux up and running with Metal as the GPU backend. The documentation for this is great, but I keep running into an issue, which I’ve distilled down to a very simple feed-forward network. The code below defines a small model and some fake data. Everything moves to the GPU just fine: I can evaluate the model on the data and take the gradient with Flux. The problem is the last line. When I try to use Flux to update the model parameters with that gradient, I get a very long error. See the screenshots below for my Metal.versioninfo() output along with the error. Note that I updated Julia and all packages just before this, so everything should be current.
The error trace is much longer than what is shown below, but the line (“unsupported use of double value”) repeats throughout it. I’m not sure how Float64 values would have entered into things, but something in Flux.update! must be causing the issue (Flux.train! fails the same way). In this simple example the gradients compute fine; it is the Adam step that is the problem.
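One thing I did poke at is the hyperparameters on the Adam rule itself. The field names below (eta, beta, epsilon) are just my reading of the Optimisers.jl source, so this may not be the right place to look:

using Flux
rule = Adam();
typeof(rule.eta), typeof(rule.beta), typeof(rule.epsilon)  # all Float64 for me, even though model and data are Float32

Could those Float64 hyperparameters be what ends up in the Metal kernel?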
Any thoughts or suggestions?
using Flux, Metal
# Define simple model
model = Chain(Dense(100 => 10, tanh), Dense(10 => 2)) |> gpu;
# Set up simple input-output data
x_in = randn(100, 1) |> gpu;   # randn gives Float64, but gpu converts this to a Float32 MtlArray
y_out = [1.0, 0.0] |> gpu;     # target vector, likewise stored as Float32 on the GPU
# Set up simple loss function
loss(m, x, y) = Flux.logitcrossentropy(m(x), y);
# Set up optimizer
opt_setup = Flux.setup(Adam(), model) |> gpu;   # model is already on the GPU, so the trailing gpu should be a no-op
grads = gradient(m -> loss(m, x_in, y_out), model);   # taking the gradient works with no complaints
typeof(grads[1].layers[1].weight)   # just checking that the gradient arrays are still on the GPU. They are.
Flux.update!(opt_setup, model, grads[1])
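Since that gradient check came back clean, I also poked at the optimiser state after setup. The indexing below assumes the setup tree mirrors the model structure, with an Optimisers.Leaf per array and Adam’s buffers in its state field; that is my reading of Optimisers.jl, so it may be off:

typeof(model[1].weight)                       # MtlMatrix{Float32, ...} for me
typeof(opt_setup.layers[1].weight.state[1])   # Adam's first-moment buffer, also a Float32 MtlArray
opt_setup.layers[1].weight.state[3]           # if I read Optimisers.jl right, the running beta powers, and these are plain Float64

So the only Float64 values I can actually find are the rule’s hyperparameters and that beta tuple, which is why I suspect the Adam step rather than the gradients.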