What is the optimal way of updating CuArray?

CUDAdrv.synchronize() helps! It was indeed an effect of asynchronous execution. @Keno is right: the operations hadn't actually finished when they appeared to have. I inserted CUDAdrv.synchronize() calls between the timing calls, and the timing profiles make sense now. Thank you very much @maleadt @Keno!!
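
For anyone who runs into the same thing, here is a minimal sketch of the timing pattern. It assumes the CuArrays and CUDAdrv packages and a working GPU. The array size and the in-place broadcast update are placeholders I made up for illustration, not the code from this thread.

```julia
using CuArrays, CUDAdrv

# Placeholder data: the size and element type are arbitrary assumptions.
d_a = CuArray(rand(Float32, 4096, 4096))

# Warm-up run so that kernel compilation is not included in the timings.
d_a .= d_a .+ 1f0
CUDAdrv.synchronize()

# Without synchronization: the call can return as soon as the kernel is
# queued, so this may measure little more than the launch overhead.
t_launch = @elapsed (d_a .= d_a .+ 1f0)
CUDAdrv.synchronize()   # drain the queue before the next measurement

# With synchronization: wait for the GPU to finish before stopping the clock.
t_synced = @elapsed begin
    d_a .= d_a .+ 1f0
    CUDAdrv.synchronize()
end

println("launch only:      ", t_launch, " s")
println("with synchronize: ", t_synced, " s")
```

The second number is the one to compare between different update strategies. The first mostly reflects how long it takes to queue the work.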