[blog post] Introduction to GPU programming

I am confused about this. I can write an alternating-sum function as below:

function altsum(x)
  res = zero(eltype(x))  # accumulator with the same element type as x
  d = one(eltype(x))     # sign that alternates between +1 and -1
  for x1 in x
    res = res + d * x1
    d = -d               # flip the sign each iteration
  end
  return res
end
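
Just to check my own understanding, on a small vector this should give 1.0 - 2.0 + 3.0 = 2.0:

altsum([1.0, 2.0, 3.0])  # expect 2.0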

When I run this, what gets executed in parallel? The loop body? What sort of thinking do I have to do when programming for GPUs?

using CuArrays, GPUArrays
x = rand(1_000_000)   # one million Float64 values on the CPU
gpux = CuArray(x)     # a copy of x in GPU memory
@time altsum(x)       # the loop over the CPU array
@time altsum(gpux)    # the same loop over the GPU array
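
My current guess is that GPUs want whole-array operations rather than a scalar loop, so here is an array-style rewrite I tried (altsum_vec is just my own name for it). The assumption is that, for a CuArray, each broadcast turns into a GPU kernel and sum runs as a parallel reduction; I am not sure this is the idiomatic way:

function altsum_vec(x)
  n = length(x)
  signs = similar(x)                   # same container type as x (Array or CuArray)
  signs .= 1 .- 2 .* mod.(0:n-1, 2)    # sign pattern 1, -1, 1, -1, ...
  return sum(signs .* x)               # element-wise product, then a reduction
end

@time altsum_vec(x)
@time altsum_vec(gpux)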