Error handling in CUDA kernels

fedoroff · April 19, 2022, 2:33pm

In my CUDA kernel I check for a specific condition and would like to terminate the execution if the condition is fulfilled. What is the proper way to do so? How can I throw an error from inside a CUDA kernel?

vchuravy · April 19, 2022, 6:40pm

You can just call error(""), but the user will only see the specific error message when they run with julia -g2.
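
As a minimal sketch of that suggestion (the kernel name `checked_kernel!` and the input data are hypothetical, and how the failure surfaces on the host may differ between CUDA.jl versions), a kernel can call error with a constant string when it hits the bad condition:

```julia
using CUDA

# Hypothetical kernel: aborts with an error when it sees a negative input.
function checked_kernel!(x)
    i = threadIdx().x
    if x[i] < 0
        # Use a constant message; string interpolation is not assumed to work on the device.
        error("negative input")
    end
    x[i] *= 2
    return nothing
end

x = CuArray(Float32[1, 2, -3, 4])
@cuda threads=length(x) checked_kernel!(x)

# The failure is reported on the host once the kernel has run.
# Start Julia with `julia -g2` to see the specific error and a device stack trace.
synchronize()
```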

fedoroff · April 20, 2022, 6:32am

Thank you. It works.

In a normal (non-debug) run, the thrown error is accompanied by a continuously repeating message:

ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.

I guess there is a typo in this message: it should be “an exception” instead of “a exception”.

maleadt · April 26, 2022, 8:00am

The proper solution is to restructure your control flow so that you can do an early return. One alternative is to emit the exit PTX instruction via LLVM.jl’s @asmcall, but not every GPU supports that instruction, and we’ve encountered miscompilations in the presence of such control flow.

fedoroff · April 26, 2022, 8:17am

I can do the early return. But how can I signal that the return was triggered by an error? In normal functions I can return a variable as an error code, but CUDA kernels cannot return a value.

maleadt · April 26, 2022, 8:30am

You can allocate a global flag and write to it from your kernel. This can be a single-element CuArray you pass as an argument, or something fancier (e.g., look for exception_flag in CUDA.jl; that’s a global flag allocated in CPU memory and mapped into the GPU address space so that we can more easily read the value without synchronizing the GPU).
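
A minimal sketch of the single-element CuArray variant, combined with an early return, might look like this. The kernel name `sqrt_kernel!`, the argument `err_flag`, and the "negative input" condition are hypothetical stand-ins for your own kernel and check:

```julia
using CUDA

# Hypothetical kernel: sets a flag and returns early when it sees a negative input.
function sqrt_kernel!(out, x, err_flag)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(x)
        if x[i] < 0
            # Every failing thread writes the same value, so the race is harmless.
            err_flag[1] = Int32(1)
            return nothing
        end
        out[i] = sqrt(x[i])
    end
    return nothing
end

x = CuArray(Float32[1, 4, -9, 16])
out = similar(x)
err_flag = CUDA.zeros(Int32, 1)   # single-element flag, 0 means "no error"

@cuda threads=length(x) sqrt_kernel!(out, x, err_flag)

# Copying the flag back to the host waits for the kernel to finish.
failed = Array(err_flag)[1] != 0
failed && @warn "sqrt_kernel! encountered a negative input"
```

Here the flag is only checked after the kernel has finished. Threads that did not hit the condition still run to completion, so the flag reports the error rather than stopping the whole launch.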

fedoroff · April 26, 2022, 8:37am

Thank you. I will try.
