Source code annotation using NVTX in CUDA.jl

I’ve been able to get source code annotation working with the CUDA.jl source code but not with my own kernels.

For example, the following MWE does not work: it complains about an unsupported call to an unknown function.

using CUDA
import NVTX

function kernel!(a)
    tid = threadIdx().x
    NVTX.@mark "here"
    a[tid] = 1.0
    return
end

function main()
    n = 512
    a = CUDA.rand(n)

    @cuda threads=n kernel!(a)
    return
end

main()
CUDA.@profile main()

NVTX is a CPU library; you cannot use it in kernels. The annotation has to live in host code, not inside the function you launch with @cuda.
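As a rough sketch of what that could look like for the MWE above, the annotation can be moved out of the kernel and wrapped around the launch on the host. This assumes NVTX.jl's `NVTX.@mark` and `NVTX.@range` macros; the marker and range names are made up for illustration, and the explicit `CUDA.synchronize()` is my addition so that the range covers the kernel's execution rather than just the asynchronous launch.

```julia
using CUDA
import NVTX

function kernel!(a)
    tid = threadIdx().x
    a[tid] = 1f0              # no NVTX calls in device code
    return
end

function main()
    n = 512
    a = CUDA.rand(n)

    # Host-side instantaneous marker (name is illustrative).
    NVTX.@mark "before launch"

    # Host-side range around the launch (name is illustrative).
    NVTX.@range "kernel! launch" begin
        @cuda threads=n kernel!(a)
        CUDA.synchronize()    # wait for the kernel so the range spans its execution
    end
    return
end

main()                # warm-up run so compilation is not profiled
CUDA.@profile main()
```

The range then shows up on the host timeline in the profiler, alongside the kernel it encloses, rather than inside the kernel itself.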

Ah! I knew it was something silly! Thanks!