With ParallelStencil, is it possible to launch multiple kernels and sync later?

Check out the following: "CUDA streams do not overlap"

… and note that you can also use ParallelStencil.ParallelKernel.@get_priority_stream(i).
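For reference, here is a minimal sketch of the "launch several kernels, sync later" pattern, using ParallelStencil's `@parallel_async` and `@synchronize` macros. The kernel `scale!` and the arrays `A` and `B` are hypothetical names made up for illustration. The sketch uses the `Threads` backend so that it runs without a GPU; on that backend the launches are not expected to actually overlap. On a GPU backend, my understanding is that you would additionally pass a distinct stream to each launch (for example something like `stream=ParallelStencil.ParallelKernel.@get_priority_stream(1)`), but treat that keyword usage as an assumption and check it against the ParallelStencil documentation.

```julia
using ParallelStencil
# Threads backend so the sketch runs on a CPU; swap in CUDA for a GPU run.
@init_parallel_stencil(Threads, Float64, 2)

# Hypothetical small kernel: scale every entry of A by s.
@parallel_indices (ix, iy) function scale!(A, s)
    A[ix, iy] = s * A[ix, iy]
    return
end

A = @ones(64, 64)
B = @ones(64, 64)

# Launch both kernels without waiting for either to finish...
@parallel_async (1:size(A, 1), 1:size(A, 2)) scale!(A, 2.0)
@parallel_async (1:size(B, 1), 1:size(B, 2)) scale!(B, 3.0)

# ...and synchronize once, after all launches.
@synchronize()
```

Whether the kernels really run concurrently on a GPU depends on the streams they are launched on and on how much of the device each kernel occupies, so it is worth profiling before relying on it.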

However, you might rather want to create one or a few larger kernels instead of all these small kernels…
