Hello everyone! I’m taking my first steps with Julia and GPU programming, and this is also my first post here, so please forgive me if the question is too basic or has already been answered; I couldn’t find anything useful in my brief search.
I’ve got a CPU function that performs a simple element-wise operation on part of an array:
function cpuf!(
    signal1::Array,
    signal2::Array,
    start_sample::Integer,
    num_samples_left::Integer
)
    # Multiply the signals element-wise over the requested range, in place
    for i = start_sample:(start_sample + num_samples_left - 1)
        signal1[i] = signal1[i] * signal2[i]
    end
end
It works on an arbitrary range of the array, determined by the arguments start_sample and num_samples_left.
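For concreteness, this is how I call it (the sizes and values here are just placeholders):

s1 = rand(Float32, 1_000)
s2 = rand(Float32, 1_000)
cpuf!(s1, s2, 101, 200)  # multiplies samples 101:300 in place, leaves the rest untouched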
So the naive translation below would fall back to scalar indexing on the CuArrays and slow the whole thing down, right?
function gpuf!(
    signal1::CuArray,
    signal2::CuArray,
    start_sample::Integer,
    num_samples_left::Integer
)
    # Each iteration indexes the CuArrays one element at a time (scalar indexing)
    for i = start_sample:(start_sample + num_samples_left - 1)
        signal1[i] = signal1[i] * signal2[i]
    end
end
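To make sure I understand the problem: as far as I can tell, CUDA.jl disallows scalar indexing by default (it warns in the REPL and errors in scripts), precisely because each access is a separate host-device round trip:

using CUDA

a = CUDA.rand(Float32, 8)
# Touching a single element needs CUDA.@allowscalar, and each such
# access copies one value between host and device
CUDA.@allowscalar a[1] *= 2.0f0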
If so, how do I get to something like the code below with maximum performance?
function gpuf!(
    signal1::CuArray,
    signal2::CuArray,
    start_sample::Integer,
    num_samples_left::Integer
)
    rng = start_sample:(start_sample + num_samples_left - 1)
    # Broadcast over views of the range so the whole update runs as one
    # GPU kernel and signal1 is mutated in place rather than rebound
    @views signal1[rng] .= signal1[rng] .* signal2[rng]
end
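This is how I would expect to call it (again with made-up sizes):

using CUDA

s1 = CUDA.rand(Float32, 1_000)
s2 = CUDA.rand(Float32, 1_000)
gpuf!(s1, s2, 101, 200)  # should multiply samples 101:300 in place on the GPU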
Or do I need to do the splitting on the CPU and then upload the slices to the GPU?