Monitoring GPU utilization and memory usage

While training Flux models, I would like to monitor the (Nvidia) GPU utilization and memory usage. Is there an alternative to dumping the output of nvidia-smi to a file and then reading and plotting the data in Julia?

At JuliaCon 2021, Tim Besard showed how he uses NVIDIA Nsight and other profilers. If you haven't seen them already, these talks might help: "CUDA.jl 3.0 | Tim Besard | JuliaCon2021" and/or "GPU programming in Julia | Workshop | JuliaCon 2021" (both on YouTube).

Good luck!


Yes, CUDA.jl has an NVML submodule that exposes functionality similar to nvidia-smi. For memory usage, see memory_info. Not all NVML functions are wrapped, so have a look at the NVIDIA documentation too; all the C functions are available under the NVML namespace as well.
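As a rough sketch of what that could look like: the snippet below polls one GPU at a fixed interval and collects the samples in plain vectors for plotting later. `sample_gpu` is a hypothetical helper, not part of CUDA.jl. I'm also assuming that `utilization_rates` returns a named tuple with a `compute` field (as a fraction) and that `memory_info` returns `total`, `free` and `used` in bytes, so check both against your CUDA.jl version.

```julia
using CUDA: NVML   # NVML submodule that ships with CUDA.jl

# Hypothetical helper (not part of CUDA.jl): sample GPU utilization and
# memory usage `nsamples` times, waiting `interval` seconds between samples.
function sample_gpu(; index=0, interval=1.0, nsamples=60)
    dev = NVML.Device(index)        # NVML device indices start at 0
    util = Float64[]                # compute utilization per sample
    used_mib = Float64[]            # used device memory per sample, in MiB
    for _ in 1:nsamples
        rates = NVML.utilization_rates(dev)  # assumed fields: compute, memory
        mem = NVML.memory_info(dev)          # assumed fields: total, free, used (bytes)
        push!(util, rates.compute)
        push!(used_mib, mem.used / 2^20)
        sleep(interval)
    end
    return (util=util, used_mib=used_mib)
end
```

To sample while training, one option is to start it on another thread with `monitor = Threads.@spawn sample_gpu(nsamples=600)` before the training loop and call `fetch(monitor)` afterwards. That needs Julia started with more than one thread. The resulting vectors can then go straight into whichever plotting package you prefer.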
