Elegant kernel for vectors in matrix

Ribeiro · February 4, 2021, 8:38pm

Hello,
I’m trying to write a kernel that does some operations on vectors using CUDA.jl and @cuda. My code is essentially this:

points = Nx3 matrix with x,y,z coordinates of points
elements = Nx4 matrix with integers that list rows in "points" making up quadrilaterals
A = NxN matrix
for i in 1:N
 for j in 1:N
   a bunch of operations (and, unfortunately, some if statements) involving the points in elements[i,:] and elements[j,:]
   A[i,j] = a function of those operations
 end
end

A lot of the operations involve dot and cross. On the CPU I deal with that in a pretty clean way. Say:

point1=points[elements[i,1],:] # gives me an array with 3 numbers x,y,z
...
vec1 = point1-point2;
...
foo = dot(vec1,vec2)
bar = cross(vec1,vec2)

Is there an elegant way to do that on the GPU? Slicing with [i,:] doesn’t seem to work inside a kernel, and extracting subvectors of an array doesn’t seem easy (nor does assigning to them, say A[1,:] = b).
I don’t even know how to create an array in a kernel. The only way I managed to do any math was to index each individual element and do the calculations the classic way (i.e., old-school C style, not Matlab style).
Thanks a lot!

dpsanders · February 4, 2021, 9:33pm

Check out the StaticArrays.jl package. Make points into a Vector containing SVectors.
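
A minimal CPU sketch of what that suggestion could look like, assuming a small made-up data set (the sample coordinates, `points_matrix`, and the choice of corner points are placeholders, not from the original code). Each row of the Nx3 matrix becomes one `SVector{3,Float64}`, so a single index returns a whole point and `dot`/`cross` work directly on it:

```julia
using StaticArrays, LinearAlgebra

# Hypothetical sample data: four points forming one unit square in the xy-plane
points_matrix = [0.0 0.0 0.0;
                 1.0 0.0 0.0;
                 1.0 1.0 0.0;
                 0.0 1.0 0.0]
elements = [1 2 3 4]   # one quadrilateral, listing rows of points_matrix

# One SVector per row instead of an Nx3 matrix
points = [SVector{3,Float64}(points_matrix[k, 1], points_matrix[k, 2], points_matrix[k, 3])
          for k in axes(points_matrix, 1)]

i = 1
point1 = points[elements[i, 1]]   # a whole (x, y, z) point, no slicing needed
point2 = points[elements[i, 2]]
point4 = points[elements[i, 4]]

vec1 = point2 - point1            # edge along x
vec2 = point4 - point1            # edge along y

foo = dot(vec1, vec2)             # perpendicular edges, so 0.0
bar = cross(vec1, vec2)           # normal of the square, along +z
```

Because an `SVector` has a fixed size known at compile time, these operations do not allocate on the heap, which is what makes the same style usable inside a GPU kernel.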

Ribeiro · February 5, 2021, 8:56am

Ah, that does seem more elegant than what I’m doing on the CPU!

A = CuArray(Vector{SVector{3,Float64}}(undef,10))

works, and I can do dot and cross! Thanks a lot!

Ribeiro · February 5, 2021, 12:57pm

@dpsanders you, sir, have just sped up a certain part of my code 10x on the CPU! Many thanks!
Now on to the GPU.
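
For readers following along, here is a rough sketch of how the SVector approach might carry over to a `@cuda` kernel. It is only an illustration under assumptions: the kernel name `fill_A!`, the random test data, the launch configuration, and the formula written into `A[i, j]` are all placeholders, since the thread does not show the real operations. It needs a CUDA-capable GPU to run.

```julia
using CUDA, StaticArrays, LinearAlgebra

# Hypothetical kernel: one thread per (i, j) entry of A
function fill_A!(A, points, elements)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    j = (blockIdx().y - 1) * blockDim().y + threadIdx().y
    if i <= size(A, 1) && j <= size(A, 2)
        # Indexing a CuDeviceArray of SVectors returns a whole point
        vec1 = points[elements[i, 2]] - points[elements[i, 1]]
        vec2 = points[elements[j, 2]] - points[elements[j, 1]]
        c = cross(vec1, vec2)
        # Placeholder formula standing in for the real computation
        A[i, j] = dot(vec1, vec2) + dot(c, c)
    end
    return nothing
end

npoints, nelements = 8, 5                       # made-up sizes
points   = CuArray([rand(SVector{3,Float64}) for _ in 1:npoints])
elements = CuArray(rand(1:npoints, nelements, 4))
A        = CUDA.zeros(Float64, nelements, nelements)

threads = (16, 16)
blocks  = (cld(nelements, threads[1]), cld(nelements, threads[2]))
@cuda threads=threads blocks=blocks fill_A!(A, points, elements)
```

The bounds check doubles as the guard for threads that fall outside the matrix, and any `if` statements in the real computation would sit inside it.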
