Your program is written correctly. Whether the operations actually overlap is up to the CUDA driver and the resources available on the device, so correct code does not by itself guarantee concurrency. To verify whether execution overlaps, use a proper profiler (e.g., Nsight Systems).
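If it helps to have something small to profile, here is a minimal sketch, not your program. It assumes the overlap in question is between kernels launched into two separate non-default streams. The `spin` kernel, the `CHECK` macro, and the cycle count are hypothetical names and values chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: abort main() on any CUDA runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical busy kernel: each thread spins for about `cycles` clock ticks,
// so the kernel is long enough to be visible on a profiler timeline.
__global__ void spin(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

int main() {
    const int kStreams = 2;
    cudaStream_t streams[kStreams];
    for (int i = 0; i < kStreams; ++i) {
        CHECK(cudaStreamCreate(&streams[i]));
    }

    // One small block per launch leaves most of the GPU idle, so the driver
    // MAY run the two kernels concurrently. It is not required to.
    for (int i = 0; i < kStreams; ++i) {
        spin<<<1, 32, 0, streams[i]>>>(100000000LL);
        CHECK(cudaGetLastError());
    }

    CHECK(cudaDeviceSynchronize());
    for (int i = 0; i < kStreams; ++i) {
        CHECK(cudaStreamDestroy(streams[i]));
    }
    printf("done\n");
    return 0;
}
```

Build it with `nvcc -o overlap overlap.cu` and run it under the profiler with `nsys profile -o overlap_report ./overlap`. In the resulting timeline, the two `spin` kernels appear on separate stream rows. If their bars overlap in time, the driver ran them concurrently; if they appear back to back, it serialized them even though the code allows overlap.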