Help with Custom Struct for High Dimensional COO Arrays

ejmeitz · May 16, 2023, 2:49pm

Hello,

I have an application that needs sparse arrays with more than two dimensions, and I would like to test running it on the GPU since the application is embarrassingly parallel. My plan was to make a CuArray of TensorVal elements using the function below. This function does run, but only if I use @allowscalar. From my understanding, it's OK that this part runs on the CPU because I'm just initializing the array. However, everything breaks when I try to use this array in the cuda_sparse function: I get an index-out-of-bounds error and the REPL crashes. I'm not sure whether the issue is in the initialization or in the execution. Any help would be appreciated! If there are better ways to do this, I'd love to hear them.

struct TensorVal
    i::Int32
    j::Int32
    k::Int32
    val::Float32
end

function get_non_zero_gpu(F3::SparseArray)
    num_nonzero = length(nonzero_values(F3))
    F3_non_zero = CuArray{TensorVal}(undef,(num_nonzero,))
    for (idx, val) in nonzero_pairs(F3)
        F3_non_zero[idx] = TensorVal(idx[1], idx[2], idx[3], val)
    end
    return F3_non_zero
end

For reference this is how I am trying to use the array (basically a high dimensional dot product):

function cuda_sparse(cuF3_sparse, cuPhi1, cuPhi2, cuPhi3)
    f = (f3_data) -> f3_data.val * cuPhi1[f3_data.i] * cuPhi2[f3_data.j] * cuPhi3[f3_data.k]
    return mapreduce(f, +, cuF3_sparse)
end

ejmeitz · May 16, 2023, 2:57pm

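As it turns out, I was being stupid: I was indexing the output array with the tensor's own (i, j, k) index instead of a running counter. The get_non_zero_gpu() function should have been the code below. I would still love to hear whether people have recommendations for how I should go about this, or whether this approach seems reasonable.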

function get_non_zero_gpu(F3::SparseArray)
    num_nonzero = length(nonzero_values(F3))
    F3_non_zero = CuArray{TensorVal}(undef,(num_nonzero,))
    count = 1
    for (idx, val) in nonzero_pairs(F3)
        F3_non_zero[count] = TensorVal(idx[1], idx[2], idx[3], val)
        count += 1
    end
    return F3_non_zero
end
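
One possible variation on the loop above, offered as a sketch rather than a tested recommendation: fill an ordinary Vector{TensorVal} on the host and copy it to the GPU in a single transfer, which avoids @allowscalar entirely. The names get_non_zero_host and cpu_sparse are hypothetical, and a Dict of (i, j, k) => value pairs stands in for the sparse tensor, on the assumption that nonzero_pairs yields index => value pairs in the same way. The CPU version of the contraction is only there to check results against.

```julia
# Same layout as the TensorVal struct in the post: three indices and a value.
struct TensorVal
    i::Int32
    j::Int32
    k::Int32
    val::Float32
end

# Build the COO entries in host memory. `nz_pairs` is any iterable of
# (i, j, k) => value pairs that supports `length` (an assumption; a Dict
# stands in for the sparse tensor here).
function get_non_zero_host(nz_pairs)
    entries = Vector{TensorVal}(undef, length(nz_pairs))
    for (n, (idx, val)) in enumerate(nz_pairs)
        # `n` is the running position in the output, not the tensor index.
        entries[n] = TensorVal(idx[1], idx[2], idx[3], val)
    end
    return entries
end

# CPU reference for the contraction done in cuda_sparse:
# the sum over stored entries of val * phi1[i] * phi2[j] * phi3[k].
function cpu_sparse(entries, phi1, phi2, phi3)
    f = e -> e.val * phi1[e.i] * phi2[e.j] * phi3[e.k]
    return mapreduce(f, +, entries)
end

F3 = Dict((1, 2, 3) => 2.0f0, (2, 1, 1) => -1.0f0)
entries = get_non_zero_host(F3)
phi = Float32[1, 2, 3]
result = cpu_sparse(entries, phi, phi, phi)
println(result)  # 2*1*2*3 + (-1)*2*1*1 = 10.0

# With CUDA.jl loaded, `CuArray(entries)` should copy the whole vector to the
# device in one transfer, since TensorVal is an isbits struct.
```

The result of cuda_sparse on the uploaded array can then be compared against cpu_sparse on the host copy, up to floating-point summation order.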
