Cross product with CUDA.jl

I am looking for a way to compute a cross product on a GPU. Basically something similar to this:

function cross(w::CuArray{Float32,1}, u::CuArray{Float32,1}, v::CuArray{Float32,1})
    w[1] = u[2] * v[3] - u[3] * v[2]
    w[2] = u[3] * v[1] - u[1] * v[3]
    w[3] = u[1] * v[2] - u[2] * v[1]
    return w
end

Can this be done with CuArrays only, or would I need to write a kernel?

You could do something like this, although I’m not sure whether working with length-3 CuArrays is ever a good idea.

using Permutations, TensorCore, LinearAlgebra
const epsilon = Float64[allunique([i,j,k]) && -sign(Permutation([i,j,k])) for i in 1:3, j in 1:3, k in 1:3]
mycross(u,v) = u ⊡ epsilon ⊡ v

u, v = rand(3), rand(3)
cross(u,v) ≈ mycross(u,v)

If you need many cross products, I'd make two CuVectors containing many StaticArrays and then broadcast the cross product over them. I haven't thought about it too much, but it should work.
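
A minimal sketch of that broadcast idea, shown on the CPU so it runs without a GPU. The names `N`, `us`, `vs`, and `ws` are made up for illustration, and the StaticArrays.jl package is assumed to be installed. Moving the inputs to the device is left as a comment because it is an assumption that was not run here.

```julia
using LinearAlgebra   # provides cross and dot
using StaticArrays    # provides SVector (third-party package)

# Store each 3-vector as an isbits SVector{3,Float32}, so a whole batch
# is just a flat vector of fixed-size elements.
N = 1000
us = [rand(SVector{3,Float32}) for _ in 1:N]
vs = [rand(SVector{3,Float32}) for _ in 1:N]

# One broadcast call computes all N cross products element by element.
ws = cross.(us, vs)

# Assumption (not run here): with CUDA.jl loaded, the same broadcast is
# expected to work on device arrays, e.g.
#   d_ws = cross.(CuArray(us), CuArray(vs))
```

The point of the layout is that each broadcast element is a complete 3-vector, so no scalar indexing into a `CuArray` should be needed.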