The cooperative launch test is broken; that is fixed in https://github.com/JuliaGPU/CUDA.jl/pull/517
I haven't seen the CUBLAS issues before, and they seem pretty problematic. It would be useful to reduce them to isolated failures, because right now it's hard to tell exactly which inputs trigger them (and whether the failures are reproducible).
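
For what it's worth, a reduction could start from something like the sketch below. It is only an illustration: I'm assuming the failing operation is a plain matrix product, which may not be what the failing tests actually exercise, and `check_gemm`, the element types, the sizes, and the tolerance are all made up for the example. The idea is to fix the seed so the inputs are identical on every run, repeat the same call several times to see whether a mismatch is deterministic, and report the first combination that disagrees with the CPU result.

```julia
using CUDA, Random

# Hypothetical helper (not part of CUDA.jl): run the same product repeatedly
# with seeded inputs and compare each GPU result against the CPU reference.
function check_gemm(T, n; seed=0, reps=10)
    Random.seed!(seed)                  # same inputs on every invocation
    A, B = rand(T, n, n), rand(T, n, n)
    expected = A * B                    # CPU reference
    dA, dB = CuArray(A), CuArray(B)
    for rep in 1:reps
        actual = Array(dA * dB)         # GPU product, copied back to the host
        if !isapprox(actual, expected; rtol=sqrt(eps(real(T))))
            return (T=T, n=n, seed=seed, rep=rep)   # first mismatch found
        end
    end
    return nothing                      # no mismatch observed
end

# Assumed element types and sizes; swap in whatever the failing tests use.
for T in (Float32, Float64, ComplexF32), n in (1, 16, 255)
    failure = check_gemm(T, n)
    failure === nothing || @warn "CUBLAS mismatch" failure
end
```

If a mismatch shows up on the first repetition every time, the failure is tied to the inputs; if it only shows up on some repetitions, it is more likely a race or a state-dependent problem.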