Adding at specific CuArray position

I haven’t thought too much about what you are trying to do overall, but this is how to use CUDA.@atomic:

using CUDA

function count_indices(indices::CuArray{Int64,1}, maxSize::Int64)
    # Initialize a CuArray of zeros with size maxSize
    counts = CUDA.zeros(Int64, maxSize)

    # Define the kernel function
    function kernel(indices, counts)
        idx = (blockIdx().x - 1) * blockDim().x + threadIdx().x
        if idx <= length(indices)
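            # Several threads may hit the same slot, so the increment must be atomic.
            # Every value in indices is assumed to lie in 1:maxSize.
            CUDA.@atomic counts[indices[idx]] += 1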
        end
        return
    end

    # Launch the kernel
    threads = 256
    blocks = cld(length(indices), threads)
    @cuda threads=threads blocks=blocks kernel(indices, counts)

    return counts
end


count_indices(CuArray([1, 2, 3, 1, 3, 3, 3, 1, 1, 1, 1]), 4)

With result:

julia> count_indices(CuArray([1, 2, 3, 1, 3, 3, 3, 1, 1, 1, 1]), 4)
4-element CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}:
 6
 1
 4
 0

Some tips:

  1. Use CuVector{T} instead of CuArray{T,1}; it is an alias for the same type and easier to read.
  2. Use Int instead of hard-coding a specific width such as Int64.
  3. If you are going to call this function a lot, preallocate counts outside and write an in-place function, count_indices!. You can still define count_indices as a thin wrapper that allocates counts and then calls the in-place version.
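
Putting the three tips together could look roughly like the sketch below. The names count_kernel! and count_indices! are just my suggestions, and it assumes every value in indices lies in 1:length(counts); nothing checks that on the GPU.

```julia
using CUDA

# Kernel at top level: one thread per entry of `indices`.
function count_kernel!(counts, indices)
    idx = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if idx <= length(indices)
        # Atomic add, since several threads may target the same slot.
        CUDA.@atomic counts[indices[idx]] += 1
    end
    return
end

# In-place version: reuses a preallocated `counts` buffer.
# Assumes all values in `indices` are within 1:length(counts).
function count_indices!(counts::CuVector{Int}, indices::CuVector{Int})
    fill!(counts, 0)                      # reset counts from any previous call
    isempty(indices) && return counts     # avoid launching a kernel with zero blocks
    threads = 256
    blocks = cld(length(indices), threads)
    @cuda threads=threads blocks=blocks count_kernel!(counts, indices)
    return counts
end

# Allocating wrapper that "does everything at once", as before.
count_indices(indices::CuVector{Int}, maxSize::Int) =
    count_indices!(CUDA.zeros(Int, maxSize), indices)
```

You would then allocate once with counts = CUDA.zeros(Int, 4) and call count_indices!(counts, indices) as often as needed; for the example input above it should give the same 6, 1, 4, 0 as before.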

Kind regards