Most efficient way of _waiting_ for GPU results?

This is fantastic!
with above fix
running 8 julia processors (2 video cards). Both GPUs are 100% loaded. CPU load dropped from 68% (8 core cpu) down to 7-16% range!
That is great! Thank you very much for your effort and patience!