Hello! I'm new to the Julia programming language.

I'm trying to write a loss function that calculates the cosine similarity for each pair of corresponding rows in x and x̂.

A simplified version of the code is below.

```
using Flux, CUDA, LinearAlgebra
CUDA.allowscalar(false)

function get_model(device)
    # Some process to create model
    return model |> device
end

# Cosine similarity between two vectors
function cosine_similarity(x, y)
    return dot(x, y) / (norm(x) * norm(y))
end

function predict(x, model)
    return model(x)
end

function eval_loss(x, model)
    x̂ = predict(x, model)
    # Broadcast over row pairs; this is the line that fails on the GPU
    cos_sim = cosine_similarity.(eachrow(x), eachrow(x̂))
    cos_loss = sum(1 .- maximum(cos_sim, dims = 1))
    return cos_loss
end

function test(x, device)
    model = get_model(device)
    x = x |> device
    ps = Flux.params(model)
    return Flux.gradient(() -> eval_loss(x, model), ps)
end

test(x, cpu)
test(x, gpu)
```

It works perfectly on the CPU, but when I change the device to the GPU, I get the CUDA scalar indexing error:

> Scalar indexing is disallowed.
> Invocation of getindex resulted in scalar indexing of a GPU array.
> This is typically caused by calling an iterating implementation of a method.
> Such implementations *do not* execute on the GPU, but very slowly on the CPU,
> and therefore are only permitted from the REPL for prototyping purposes.
> If you did intend to index this array, annotate the caller with @allowscalar.

After debugging, I realized that the problem is caused by the broadcasted (vectorized) function call:

`cos_sim = cosine_similarity.(eachrow(x), eachrow(x̂))`

I have no idea how to solve it. I tried `map` and `broadcast`, but the problem still exists. Is there any way to solve this?

Many thanks!

===============================

Edited:

Splitting x and x̂ from matrices into vectors and rewriting the cosine similarity function solved my problem. Thanks a lot!
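
For anyone who hits the same error: below is a minimal sketch of one way to compute the row-wise cosine similarity without iterating over rows, so that only whole-array operations are used. It is not necessarily the exact code I ended up with. The function name `rowwise_cosine_similarity` and the small constant `ϵ` (added to avoid division by zero) are my own assumptions, not part of Flux or CUDA.

```
# Row-wise cosine similarity of two matrices with the same size.
# Uses only broadcasting and reductions (no eachrow, no scalar indexing),
# which are the kinds of operations GPU arrays generally support.
# `ϵ` is a hypothetical guard against zero-norm rows.
function rowwise_cosine_similarity(x::AbstractMatrix, y::AbstractMatrix; ϵ = 1f-8)
    dots = sum(x .* y; dims = 2)          # dot product of each row pair, size (n, 1)
    nx = sqrt.(sum(abs2, x; dims = 2))    # Euclidean norm of each row of x
    ny = sqrt.(sum(abs2, y; dims = 2))    # Euclidean norm of each row of y
    return vec(dots ./ (nx .* ny .+ ϵ))   # one similarity value per row
end

# Small CPU example
x = [1.0 0.0; 1.0 1.0; 0.0 2.0]
y = [1.0 0.0; 1.0 -1.0; 0.0 -3.0]
sims = rowwise_cosine_similarity(x, y)
```

The returned vector should be usable in place of the result of `cosine_similarity.(eachrow(x), eachrow(x̂))` inside the loss, but I have only sketched it here on CPU arrays.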