I’m building a custom GPU kernel and am having trouble indexing into a GPU vector. Q is a matrix stored in CSR format with the usual three arrays (a small example of the layout is given below):

- `qrows`: row pointers
- `qcols`: column indices of the nonzero entries
- `qvals`: values of the nonzero entries
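
To make the layout concrete, here is a tiny hypothetical 3×3 example of how I understand the three arrays to relate (not my actual data):

```julia
# Hypothetical 3x3 matrix, used only to illustrate the CSR layout:
#   [ 1.0  0.0  2.0 ]
#   [ 0.0  3.0  0.0 ]
#   [ 4.0  0.0  5.0 ]
qrows = [1, 3, 4, 6]               # row i spans entries qrows[i]:qrows[i+1]-1
qcols = [1, 3, 2, 1, 3]            # column index of each stored nonzero
qvals = [1.0, 2.0, 3.0, 4.0, 5.0]  # value of each stored nonzero
```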
Here is a code snippet:
```julia
function q_kernel!(qrows::CuDeviceVector{Int64}, qcols::CuDeviceVector{Int64}, qvals::CuDeviceVector{Float64})
    index = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = gridDim().x * blockDim().x
    tot::Float64 = 0.0
    @simd for i = index:stride:length(qrows)-1  # loop over the rows of the matrix (same number of rows for J and Q)
        @inbounds colind = qrows[i]:qrows[i+1]-1
        @inbounds indj = qcols[colind]  # find column indices of all the nonzero elements of row i in Q
        for j in indj  # this loops over all the nonzero elements of row i in Q
```
The second-to-last line, `indj = qcols[colind]`, triggers the following errors:
```
LoadError: InvalidIRError: compiling kernel
Reason: unsupported dynamic function invocation (call to print_to_string(xs...) in Base at strings/io.jl:133)
Reason: unsupported call through a literal pointer (call to ijl_alloc_array_1d)
```
What is the correct way to index into qcols on the GPU?
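
My guess is that the slice `qcols[colind]` tries to allocate a temporary array on the device, which is not allowed inside a kernel. Here is a minimal sketch of the workaround I have in mind, replacing the slice with scalar loads over the same index range (the loop body is a placeholder and the function name is my own; I am not sure this is the idiomatic approach):

```julia
using CUDA

function q_kernel_scalar!(qrows, qcols, qvals)
    index  = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    stride = gridDim().x * blockDim().x
    for i = index:stride:length(qrows)-1          # grid-stride loop over the rows
        @inbounds for k = qrows[i]:qrows[i+1]-1   # nonzero entries of row i
            j = qcols[k]                          # column index, scalar load (no allocation)
            v = qvals[k]                          # corresponding value
            # ... use j and v here ...
        end
    end
    return nothing
end
```

I would launch it with something like `@cuda threads=256 blocks=cld(length(d_qrows)-1, 256) q_kernel_scalar!(d_qrows, d_qcols, d_qvals)` (device-array names hypothetical). Is that the right way to do it, or is there a better pattern for this?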