CUDA.jl - Atomic Addition Error To A Sparse Array Inside CUDA Kernel

moukann · March 31, 2022, 9:13am

Hello all,

I am trying to assemble many individual elements of a huge system to a global system. The global system is sparse, It is a certain thing. Also it is wasteful to store all zero elements in memory, I just need the nonzero elements. So I wanted to use a sparse array for my program.

If I use a classical “zeros” array, I don’t get an error. However if I change the type to sparse array, I get an error. Example code:

clearconsole()
# Import related libraries
using CUDA
using StaticArrays

# Assembly kernel
function gpu_arbitrary_assemblage(global_mat, indices_coor, indices_contribution)
    tidx = (blockIdx().x - 1) * blockDim().x + threadIdx().x #calculate thread ID number
    if tidx <= 10
        indices = indices_coor[:, tidx]
        CUDA.@atomic global_mat[indices[1], indices[2]] += indices_contribution[tidx] # perform atomic addition to many array locations
    end
    return nothing
end

checking = "not works 1"
# the global array definition. It all starts with zeros, or lets say empty values.
if checking == "works"
    global_mat = CUDA.fill(0.0f0, (6, 6))
elseif checking == "not works 1"
    global_mat = CUSPARSE.CuSparseMatrixCSR(CUDA.fill(0.0f0, (6, 6)))
elseif checking == "not works 2"
    global_mat = CUSPARSE.CuSparseMatrixCSC(CUDA.fill(0.0f0, (6, 6)))
end


# the indices, information for add which value to "where".
# I have no problem with this part, I solved it. There is also a topic which I created about this.
# The topic is "CUDA.jl - Sub-Vector Indexing Problem Inside CUDA Kernel"
indices_coor = SMatrix{2, 10, Int32}([3 3 1 1 4 4 4 2 5 5;
                                      5 5 2 2 4 4 6 3 2 2])

indices_contribution = cu([0.6, 0.6, 0.7, 0.7, 0.8, 0.8, 1, 1, 0.9, 0.9]) #random float numbers to be added on global array

# display the initial global array
display(global_mat)

# check the type of the array
display(typeof(global_mat))

#cuda kernel
@cuda threads = (32, 1, 1) gpu_arbitrary_assemblage(global_mat, indices_coor, indices_contribution);

#show the result
display(global_mat)

You can test this by changing the “checking” variable to “works”, “not works 1” and “not works 2”. The errors I get are:

Reason: unsupported call through a literal pointer (call to jl_alloc_string)
Reason: unsupported call through a literal pointer (call to memset)
Reason: unsupported call through a literal pointer (call to jl_array_grow_end)
Reason: unsupported call through a literal pointer (call to jl_array_del_end)

As you can see, if i change the type of the array, the kernel does not work. Is there any solution for me to use sparse array instead of a normal array? I use Juno editor. Thank you.

maleadt · March 31, 2022, 10:14am

No, mutating a sparse array like that is not supported. General setindex! may require reallocation of the sparse array components to make place for that element, which is currently impossible (and generally a very bad thing to do like that on a GPU anyway, since this will wreck performance).

moukann · March 31, 2022, 10:41am

Thank you, I will continue to use normal arrays but having a feature like this would be wonderful.

maleadt · March 31, 2022, 10:57am

It’s not going to happen, though, it’s just inherently incompatible with how GPUs work. Only in specific cases, where you know the sparsity structure beforehand (and you can allocate and initialize the sparse array components on the host before the kernel launch) we can support this. For example, with broadcast: https://github.com/JuliaGPU/CUDA.jl/pull/1380

moukann · March 31, 2022, 11:18am

Actually I know where exactly to add values from DegreeOfFreedom and BoundaryCondition arrays but I have to finish my project on time, I cannot try this new feature. I just wanted to use it if existed. I will try to further optimize my program in summer. Thank you for answers.

Topic		Replies	Views
CUDA.jl - Memory Efficient Operations, Manipulations and Calculations on Large Sparse Arrays GPU gpu , gpuarrays , memory , sparse , simulations	6	1436	June 21, 2022
Initializing Sparse Matrices with CuArrays.jl GPU first-steps	4	1914	February 22, 2019
Use GPU to generate known sparse matrix GPU	2	2020	January 6, 2020
Using sparse matrix in CUDA kernel GPU question , gpu	1	798	November 27, 2023
First CUDA Program- Addressing Scalar Indexing GPU	5	1167	June 28, 2021

CUDA.jl - Atomic Addition Error To A Sparse Array Inside CUDA Kernel

Related topics