Cannot take the CPU address of a CUDA.CuArray{Float32, 1, CUDA.Mem.DeviceBuffer} in LinearAlgebra matmul

I have the following function, which normally computes correctly:

function predict_train(model::Model, batch)
  (calculate_hidden(model, batch, dropout_active=true) .^ 3) |>
    model.upper_layer |>
    (upper) -> model.output_layer(upper')
end

I also have this loss function:

loss = (sample) -> begin
  sum(sample) do (batch, gold)
    predict_train(model, batch) |> scores -> transition_loss(scores, gold)
  end + L2_norm(ps, training_context.settings)
end

Normally this works fine, and I can call both functions separately without problems. But when I call Zygote.gradient, I get the "cannot take the CPU address" error, which is raised in the output_layer calculation. I checked the types just before the error, and both the output layer weights and the upper layer result are CuArrays, yet the error still occurs every time in LinearAlgebra.BLAS.