Using NVIDIA Nsight Systems

Hello,

I am trying to profile CUDA code using Nsight Systems, but I am running into a warning and an error.

I am using version 2024.1.1.59 of Nsight Systems.

omairyrm@vulture:~/Julia_rabab$ nsys launch  julia 
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using CUDA
┌ Warning: CUDA runtime library libcupti.so.11.8 was loaded from a system path. This may cause errors.
│ Ensure that you have not set the LD_LIBRARY_PATH environment variable, or that it does not contain paths to CUDA libraries.
└ @ CUDA ~/.julia/packages/CUDA/htRwP/src/initialization.jl:187

(@v1.10) pkg> st
Status `~/.julia/environments/v1.10/Project.toml`
  [6e4b80f9] BenchmarkTools v1.4.0
  [052768ef] CUDA v5.2.0
  [587475ba] Flux v0.14.11
  [91a5bcdd] Plots v1.40.1
  [02a925ec] cuDNN v1.3.0

julia> a = CUDA.rand(1024,1024,1024);

julia> sin.(a);

julia> CUDA.@profile sin.(a);
ERROR: CUPTIError: CUPTI doesn't allow multiple callback subscribers. Only a single subscriber can be registered at a time. (code 39, CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED)
Stacktrace:
  [1] throw_api_error(res::CUDA.CUPTI.CUptiResult)
    @ CUDA.CUPTI ~/.julia/packages/CUDA/htRwP/lib/cupti/libcupti.jl:11
  [2] check
    @ ~/.julia/packages/CUDA/htRwP/lib/cupti/libcupti.jl:21 [inlined]
  [3] cuptiActivityRegisterCallbacks
    @ ~/.julia/packages/CUDA/htRwP/lib/utils/call.jl:26 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/CUDA/htRwP/lib/cupti/wrappers.jl:188 [inlined]
  [5] macro expansion
    @ ./lock.jl:267 [inlined]
  [6] enable!(f::CUDA.Profile.var"#3#4"{var"##3#profiled_code"}, cfg::CUDA.CUPTI.ActivityConfig)
    @ CUDA.CUPTI ~/.julia/packages/CUDA/htRwP/lib/cupti/wrappers.jl:178
  [7] profile_internally(f::var"##3#profiled_code"; concurrent::Bool, kwargs::@Kwargs{})
    @ CUDA.Profile ~/.julia/packages/CUDA/htRwP/src/profile.jl:273
  [8] profile_internally(f::Function)
    @ CUDA.Profile ~/.julia/packages/CUDA/htRwP/src/profile.jl:239
  [9] top-level scope
    @ ~/.julia/packages/CUDA/htRwP/src/profile.jl:62
 [10] top-level scope
    @ ~/.julia/packages/CUDA/htRwP/src/initialization.jl:206

You have to use CUDA.@profile external=true. The regular CUDA.@profile uses the internal profiler, which clashes with Nsight Systems: as the error message says, CUPTI allows only one callback subscriber at a time. I hope to auto-detect that in the future, but you have to be explicit for now.
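
A minimal sketch of that suggestion, assuming Julia was started with `nsys launch julia` as in the question and that a working GPU is available. The helper name `profile_sin_external` is made up for illustration, and the array is smaller than the one in the question to keep memory use low:

```julia
using CUDA

# Hypothetical helper, not part of CUDA.jl: runs the broadcast from the
# question under the external profiler and returns whether it actually ran.
function profile_sin_external(n::Integer=1024)
    # Skip gracefully on machines without a usable GPU.
    CUDA.functional() || return false

    a = CUDA.rand(Float32, n, n)
    sin.(a)  # warm-up call so kernel compilation is not part of the profiled region

    # external=true defers to the external profiler (here Nsight Systems)
    # instead of CUDA.jl's integrated CUPTI-based profiler.
    CUDA.@profile external=true begin
        CUDA.@sync sin.(a)
    end
    return true
end

profile_sin_external()
```

Without `external=true`, the same call would go through the integrated profiler and try to register a second CUPTI subscriber next to Nsight Systems, which is what produces the CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED error shown above.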
