Problems copying data on the GPU

I’m having trouble with a basic copy operation on the GPU. My goal is to copy a chosen subset of data from an N-D array into a vector. My two attempts below have both failed.

using CUDA
CUDA.allowscalar(false)  # disallow slow element-by-element operations
a = CUDA.rand(2, 4)                                    # source array on the GPU
c = cu([CartesianIndex(1, 2), CartesianIndex(2, 3)])   # indices on the GPU
b = a[c]                                               # attempt 1: index directly

This fails with a scalar indexing error.

b = CUDA.zeros(2)
copyto!(b, CartesianIndices(b), a, c)   # attempt 2: four-argument copyto!

This produces a MethodError, presumably because that copyto! method expects CartesianIndices for both index arguments rather than a vector of CartesianIndex.

I must be missing something very simple. Any advice? I know I can write a custom kernel, but this seems like a fairly generic copy operation.
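
For reference, the custom-kernel route I’d like to avoid looks roughly like this (a minimal sketch; gather_kernel! and the launch configuration are my own choices, not anything from CUDA.jl):

using CUDA

# Gather src[idx[i]] into dst[i], where idx is a device vector of CartesianIndex.
function gather_kernel!(dst, src, idx)
    i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
    if i <= length(idx)
        @inbounds dst[i] = src[idx[i]]
    end
    return nothing
end

@cuda threads=256 blocks=cld(length(c), 256) gather_kernel!(b, a, c)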

Use a view:

julia> c = view(a, [CartesianIndex(1,2), CartesianIndex(2,3)])
2-element view(::CuArray{Float32,2}, CartesianIndex{2}[CartesianIndex(1, 2), CartesianIndex(2, 3)]) with eltype Float32:
 0.81763107
 0.610714

julia> copyto!(b, c)
2-element CuArray{Float32,1}:
 0.81763107
 0.610714

I should have mentioned that I started with a view and found it is treated as a scalar operation.

using CUDA
CUDA.allowscalar(false)
a = CUDA.rand(2, 4)
c = cu([CartesianIndex(1, 2), CartesianIndex(2, 3)])   # indices on the GPU
c = view(a, c)   # also trips the scalar-indexing check

If the CartesianIndex vector is on the CPU, then the view works. In my case the indices start out on the GPU (returned from findall()), roughly the pattern sketched below, and benchmarking shows I should avoid copying them to the CPU.
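
Concretely (a sketch, with an assumed threshold predicate standing in for my real one):

mask = a .> 0.5       # CuArray{Bool}
idx = findall(mask)   # vector of CartesianIndex, stays on the GPU
b = a[idx]            # the gather that trips the scalar-indexing check above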

Note that I’m using the CUDA.jl master branch, where there have been a number of SubArray changes and fixes.

The master branch works. Thanks!

Do you know when you plan to release v1.3.4?

This week 🙂 Tentative release notes are up here: https://juliagpu.org/2020-09-28-cuda_2.0/


I’m finding 2.0 is breaking a lot of our codebase.

Objects created from view(reinterpret(x)) and view(reshape(x)) now produce errors when I broadcast into them; the failing pattern is sketched below. Should I start opening bug reports?
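
A minimal sketch of the failing pattern (assumed element types and sizes, not our actual code):

using CUDA
x = CUDA.rand(Float32, 4)
y = view(reinterpret(Int32, x), 1:2)   # view of a reinterpreted array
z = view(reshape(x, 2, 2), 1, :)       # view of a reshaped array
y .= Int32(0)                          # broadcasts like these now error on 2.0
z .= 0f0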

Please do. Note that the new behavior is much more in line with Base, so if you’re getting errors, it’s likely that the corresponding Array operations would have been slow too (i.e., they wouldn’t have dispatched to BLAS).
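
For example (an illustrative CPU sketch of the same point): a view with scattered indices isn’t a StridedArray, so even on the CPU it can’t dispatch to BLAS and falls back to a generic method.

using LinearAlgebra
A = rand(Float32, 100)
v = view(A, [1, 7, 42])   # non-contiguous SubArray, not a StridedVector
dot(v, v)                 # works, but via the generic fallback, not BLAS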
