Most efficient way of _waiting_ for GPU results?

https://github.com/JuliaGPU/CuArrays.jl/pull/245