GPU gradient calculation with Zygote failing with an RTX 4060

Hello all, I’m having an issue with gradient calculation on the GPU. The minimal reproducing example is very small:

using Flux, CUDA, cuDNN

x = rand(Float32, 1, 1000) |> gpu
y = rand(Float32, 1, 1000) |> gpu

model = Flux.Chain(
    Flux.Dense(1, 10, tanh),
    Flux.Dense(10, 10, tanh),
    Flux.Dense(10, 1)
) |> gpu

loss(model, x, y) = Flux.mse(model(x), y)

loss(model, x, y)

gradient(loss, model, x, y)
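For anyone trying to reproduce: the same code with the |> gpu moves dropped is a quick way to check whether the failure is GPU-specific. This is just the example above kept on the CPU, nothing new:

```julia
using Flux

# Same data and model as above, but left on the CPU.
x = rand(Float32, 1, 1000)
y = rand(Float32, 1, 1000)

model = Flux.Chain(
    Flux.Dense(1, 10, tanh),
    Flux.Dense(10, 10, tanh),
    Flux.Dense(10, 1)
)

loss(model, x, y) = Flux.mse(model(x), y)

# If this succeeds, the problem is in the CUDA code path,
# not in the model or the loss definition.
grads = gradient(loss, model, x, y)
```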

Allocating the arrays and the model works fine, and I can evaluate the loss, but the gradient call fails with:

ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
WARNING: Error while freeing DeviceBuffer(3.906 KiB at 0x000000020502c800):
CUDA.CuError(code=CUDA.cudaError_enum(0x000002bc), details=CUDA.Optional{String}(data=nothing))

Afterwards, I can’t do anything GPU-related in that session.
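In case it helps with triage: illegal-address errors (code 700) can sometimes be localized by running the reproducer under NVIDIA's compute-sanitizer, which reports the offending kernel and access. The script name here is a placeholder for the example above saved to a file:

```shell
# memcheck is the default tool; it prints the kernel and the
# address of the illegal access when the error is hit
compute-sanitizer --tool memcheck julia repro.jl
```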

Julia version: 1.9.4

  [052768ef] CUDA v5.1.1
  [587475ba] Flux v0.14.7
  [02a925ec] cuDNN v1.2.1

Update: loss(model, x, y) = norm(model(x) .- y) (with using LinearAlgebra) works perfectly well, so opening an issue on Flux.jl is probably the way to go.