[blog post] Introduction to GPU programming

Pretty awesome stuff. One thing that isn't clear to me: is writing a loop fast on the GPU, or does the operation have to be vectorized to be fast?

Basically, ‘sum’ will be fast, but is writing a loop that sums every element of the vector just as fast? I can test this particular case, but are there principles I can follow in general? Something like the sketch below is what I have in mind.
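For concreteness, here is the kind of comparison I mean, using CuPy purely as a stand-in GPU array library (an assumption on my part, not necessarily what the post uses):

```python
import cupy as cp  # CuPy used here only as an example GPU array library

x = cp.random.rand(1_000_000)

# Vectorized version: one library call, executed as a single
# reduction kernel on the GPU.
fast_total = cp.sum(x)

# Hand-written loop: each x[i] indexes a single element of a GPU
# array, and float() forces a device-to-host copy, so every
# iteration round-trips between host and device.
slow_total = 0.0
for i in range(x.size):
    slow_total += float(x[i])
```

Is the loop version just as fast, or does the per-element access kill it?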
Also, how do I invoke the GPU's sorting functions to sort a large vector? For example, if the library were CuPy (again, just an assumption for illustration), I'd expect something like the sketch below; is there an equivalent one-liner here?
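```python
import cupy as cp
import numpy as np

x = cp.random.rand(10_000_000)  # large vector resident in GPU memory
x_sorted = cp.sort(x)           # sorts on the device, returns a new sorted GPU array

# Sanity check on the host: result should be non-decreasing.
assert np.all(np.diff(cp.asnumpy(x_sorted)) >= 0)
```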
