I think the problem here is that your scalar k=0.25 is Float64, while the arrays are Float32. This promotes the array (which you probably don’t want, for performance reasons), and that leads to mixed element types in the gradient, which is what’s giving an error:
julia> using CUDA, LinearAlgebra
julia> CUDA.allowscalar(false)
julia> dot(CuArray([1f0, 2f0]), CuArray([3f0, 4f0]))
11.0f0
julia> dot(CuArray([1f0, 2f0]), CuArray([3.0, 4.0]))
ERROR: Scalar indexing is disallowed.
...
Stacktrace:
...
[5] dot(x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, y::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer})
julia> using Zygote
julia> gradient(x -> sum(x * 3f0), CuArray([1f0, 2f0]))[1]
2-element CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}:
3.0
3.0
julia> gradient(x -> sum(x * 3.0), CuArray([1f0, 2f0]))[1]
ERROR: Scalar indexing is disallowed.
...
[5] dot(x::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, y::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer})