CUDA(.jl) memory errors for very large kernels

The errors aren’t misleading as such: this is exactly what the CUDA API returns. The problem is that you’re generating very atypical kernel code that exhausts the amount of memory available for register spills, which results in an “out of memory” error that has nothing to do with the usual case of running out of available device memory. You should change your code generation so that it doesn’t use that many registers and doesn’t spill that much. Regardless of whether you get this to compile, the kernel is expected to execute extremely slowly, since you’ll be getting very low occupancy. (To be fair, there are niche cases where extremely low-occupancy kernels perform well, but those are few and far between.)
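To make the occupancy point concrete, here is a rough back-of-the-envelope sketch in Python. It only does the arithmetic, nothing CUDA-specific. The function name `register_limited_occupancy` is made up for illustration, and the default hardware limits (65536 registers and 2048 resident threads per multiprocessor) are assumptions that hold for many but not all NVIDIA GPUs. It also ignores register allocation granularity, shared memory, and the cost of spilling itself, so treat the numbers as an upper bound rather than a prediction.

```python
def register_limited_occupancy(regs_per_thread, threads_per_block,
                               regs_per_sm=65536, max_threads_per_sm=2048):
    """Estimate occupancy when registers are the limiting resource.

    The default limits are assumed values; check your device's actual
    limits before relying on the result.
    """
    # Registers needed to keep one whole block resident.
    regs_per_block = regs_per_thread * threads_per_block
    # How many blocks fit on one multiprocessor by register count.
    blocks_by_regs = regs_per_sm // regs_per_block
    # How many blocks fit by the resident-thread limit.
    blocks_by_threads = max_threads_per_sm // threads_per_block
    resident_blocks = min(blocks_by_regs, blocks_by_threads)
    # Occupancy = resident threads / maximum resident threads.
    return resident_blocks * threads_per_block / max_threads_per_sm


if __name__ == "__main__":
    for regs in (32, 64, 128, 255):
        occ = register_limited_occupancy(regs, threads_per_block=256)
        print(f"{regs:3d} registers/thread -> occupancy {occ:.3f}")
```

Under these assumptions, a kernel that already sits at the per-thread register limit and still has to spill can keep only a single 256-thread block resident per multiprocessor, which leaves very little room to hide memory latency.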
