Hello!
I'm new to the Flux library.
I went through the documentation and understand how scatter! works for one-dimensional arrays. I am now trying to use it on the GPU with an array of static vectors. The code I use is:
using Flux, CUDA, StaticArrays
T = Float32
NL = 10^6
src = CuArray(rand(SVector{3,T},NL))
idx = CuArray(rand(1:6195, NL))
dst = CuArray(zeros(SVector{3,T}, NL))
NNlib.scatter!(+, dst, src, idx)
This throws the following error:
ERROR: InvalidIRError: compiling kernel #scatter_kernel!(typeof(+), CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceVector{Int64, 1}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to atomic_cas!)
I’ve tested the CPU version and it works fine. Why does it fail on the GPU?
Kind regards