I’ve been able to get source code annotation working with the CUDA.jl source code but not with my own kernels.
For example, the following MWE does not work; it complains about an "unsupported call to unknown function":
using CUDA
import NVTX

function kernel!(a)
    tid = threadIdx().x
    NVTX.@mark "here"
    a[tid] = 1.0
    return
end

function main()
    n = 512
    a = CUDA.rand(n)
    @cuda threads=n kernel!(a)
    return
end

main()
CUDA.@profile main()