Hello all,
I am trying to assemble many individual elements of a huge system to a global system. The global system is sparse, It is a certain thing. Also it is wasteful to store all zero elements in memory, I just need the nonzero elements. So I wanted to use a sparse array for my program.
If I use a classical “zeros” array, I don’t get an error. However if I change the type to sparse array, I get an error. Example code:
clearconsole()
# Import related libraries
using CUDA
using StaticArrays
# Assembly kernel
function gpu_arbitrary_assemblage(global_mat, indices_coor, indices_contribution)
tidx = (blockIdx().x - 1) * blockDim().x + threadIdx().x #calculate thread ID number
if tidx <= 10
indices = indices_coor[:, tidx]
CUDA.@atomic global_mat[indices[1], indices[2]] += indices_contribution[tidx] # perform atomic addition to many array locations
end
return nothing
end
checking = "not works 1"
# the global array definition. It all starts with zeros, or lets say empty values.
if checking == "works"
global_mat = CUDA.fill(0.0f0, (6, 6))
elseif checking == "not works 1"
global_mat = CUSPARSE.CuSparseMatrixCSR(CUDA.fill(0.0f0, (6, 6)))
elseif checking == "not works 2"
global_mat = CUSPARSE.CuSparseMatrixCSC(CUDA.fill(0.0f0, (6, 6)))
end
# the indices, information for add which value to "where".
# I have no problem with this part, I solved it. There is also a topic which I created about this.
# The topic is "CUDA.jl - Sub-Vector Indexing Problem Inside CUDA Kernel"
indices_coor = SMatrix{2, 10, Int32}([3 3 1 1 4 4 4 2 5 5;
5 5 2 2 4 4 6 3 2 2])
indices_contribution = cu([0.6, 0.6, 0.7, 0.7, 0.8, 0.8, 1, 1, 0.9, 0.9]) #random float numbers to be added on global array
# display the initial global array
display(global_mat)
# check the type of the array
display(typeof(global_mat))
#cuda kernel
@cuda threads = (32, 1, 1) gpu_arbitrary_assemblage(global_mat, indices_coor, indices_contribution);
#show the result
display(global_mat)
You can test this by changing the “checking” variable to “works”, “not works 1” and “not works 2”. The errors I get are:
- Reason: unsupported call through a literal pointer (call to jl_alloc_string)
- Reason: unsupported call through a literal pointer (call to memset)
- Reason: unsupported call through a literal pointer (call to jl_array_grow_end)
- Reason: unsupported call through a literal pointer (call to jl_array_del_end)
As you can see, if i change the type of the array, the kernel does not work. Is there any solution for me to use sparse array instead of a normal array? I use Juno editor. Thank you.