I’m having trouble with a basic copy operation on the GPU. My goal is to copy a chosen subset of data from an N-D array into a vector. Both of the following attempts have failed.
```julia
using CUDA
CUDA.allowscalar(false)

a = CUDA.rand(2, 4)
c = cu([CartesianIndex(1, 2), CartesianIndex(2, 3)])

# Attempt 1: fails with a scalar-indexing error
b = a[c]

# Attempt 2: throws a MethodError
b = CUDA.zeros(2)
copyto!(b, CartesianIndices(b), a, c)
```
I must be missing something very simple. Any advice? I know I can write a custom kernel, but this seems like a fairly generic copy operation.
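One direction I’ve been experimenting with (a sketch only, not verified on my setup) is to phrase the gather as a broadcast, wrapping the source array in `Ref` so that only the index vector is iterated:

```julia
using CUDA
CUDA.allowscalar(false)

a = CUDA.rand(2, 4)
c = cu([CartesianIndex(1, 2), CartesianIndex(2, 3)])

# Broadcasting getindex over the GPU index vector should compile to a
# single kernel; Ref(a) keeps `a` from being iterated element-wise.
b = getindex.(Ref(a), c)
```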
I should have mentioned that I started with a view and found that it, too, is treated as a scalar operation.
```julia
using CUDA
CUDA.allowscalar(false)

a = CUDA.rand(2, 4)
c = cu([CartesianIndex(1, 2), CartesianIndex(2, 3)])
b = view(a, c)  # also treated as a scalar operation
```
If the vector of CartesianIndex values is on the CPU, then view works. In my case the indices start on the GPU (returned from findall()), and benchmarking shows I should avoid copying them to the CPU.
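For context, the full pipeline I’m aiming for looks roughly like this (a sketch; the `0.5f0` threshold is just a placeholder for my real predicate):

```julia
using CUDA
CUDA.allowscalar(false)

a = CUDA.rand(2, 4)
c = findall(>(0.5f0), a)   # CuArray of CartesianIndex, stays on the GPU
b = view(a, c)             # this is the step that hits scalar indexing
```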
I’m finding that 2.0 breaks a lot of our codebase.
Objects created from view(reinterpret(x)) and view(reshape(x)) are now producing errors when I broadcast into them. Should I start opening bug reports?
Please do. Note that the new behavior is much more in line with Base, so if you’re getting errors it’s likely that the corresponding Array operations would have been slow too (i.e., they wouldn’t have dispatched to BLAS).
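A minimal reproducer along these lines is the most useful thing to attach to each report (a sketch; the shapes and element types are arbitrary stand-ins for the real code):

```julia
using CUDA
CUDA.allowscalar(false)

x = CUDA.rand(Float32, 4, 4)

# Broadcasting into a view of a reshaped array
y = view(reshape(x, 2, 8), :, 1:4)
y .= 1f0

# Broadcasting into a view of a reinterpreted array
z = view(reinterpret(Int32, x), 1:4)
z .= Int32(0)
```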