How would I retrieve the result from CUDA.@profile?


Checking out the new release of CUDA.jl, which really has made profiling a lot easier! Much appreciated. I was wondering though, how do I retrieve the result from the profiling?

julia> a = CUDA.rand(1024, 1024, 1024)
julia> CUDA.@profile trace=true CUDA.@sync a .+ a
Profiler ran for 12.29 ms, capturing 527 events.

Host-side activity: calling CUDA APIs took 11.75 ms (95.64% of the trace)
│  ID │     Start │      Time │ Thread │                    Name │
│   5 │   6.91 µs │  13.59 µs │      1 │ cuMemAllocFromPoolAsync │
│   9 │  36.72 µs │ 199.56 µs │      1 │          cuLaunchKernel │
│ 525 │ 510.69 µs │  11.75 ms │      2 │     cuStreamSynchronize │

What I wish for is something similar to BenchmarkTools, where one can capture the timing results in a variable, display them later, manipulate the data, etc. I am not sure how to do that with CUDA.jl though?

Perhaps I am missing something obvious :slight_smile:

Kind regards

The reason we don’t have such an API is that it’s not clear what the ‘result from profiling’ should be. Is it the kernel times? The API times? The NVTX ranges? Or just the total execution time of the code on the CPU? Or on the GPU? The profiler UI reports all of that information.

If you just want to measure the total execution time, you should use Base.@elapsed CUDA.@sync ... for the CPU time, and CUDA.@elapsed ... for the GPU time.
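For example (a minimal sketch of the two macros mentioned above; this requires a CUDA-capable GPU to run):

```julia
using CUDA

a = CUDA.rand(1024, 1024, 1024)

# CPU-side wall time, including waiting for the GPU to finish:
cpu_t = Base.@elapsed CUDA.@sync a .+ a

# GPU-side time, measured with CUDA events:
gpu_t = CUDA.@elapsed a .+ a
```

Both return the elapsed time in seconds, so you can store and compare them like any other value.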

If you want more specific times, there’s CUDA.@profiled now (on CUDA.jl#master), which returns the internal state of the profiler, but the format of that isn’t documented. That’s on purpose, so it can change without requiring a breaking release.


Thanks for the explanation!

Perfect, I will see if I can get those to work well for me. In any case, this is a huge step up from before, so I'm still very pleased.

On a side note: I am not sure who selected your answer as the solution, and while I agree it is, and it is very well explained, it would be nice if I had the option to select the solution myself :slight_smile:

Maybe we should return an object that contains the data and uses show to be nicely displayed.

I noticed the other day that the current output to stdout doesn’t look as nice in Pluto.
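As an illustration of that idea, here's a purely hypothetical sketch (the type and field names are invented, not CUDA.jl code): an object that holds the raw events but prints a readable summary through a custom `show` method.

```julia
# Hypothetical result type; names here are illustrative only.
struct ProfileResultSketch
    host_events::Vector{NamedTuple}  # e.g. (; id, start, time, name)
end

# A custom text/plain `show` lets the REPL (or Pluto) render a summary
# while the raw data stays programmatically accessible.
function Base.show(io::IO, ::MIME"text/plain", r::ProfileResultSketch)
    println(io, "Profiler captured $(length(r.host_events)) host events:")
    for ev in r.host_events
        println(io, "  $(ev.name): $(ev.time * 1e6) µs")
    end
end

r = ProfileResultSketch([(id = 9, start = 36.72e-6, time = 199.56e-6, name = "cuLaunchKernel")])
r.host_events[1].name  # data access works independently of the display
```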


That’s a good idea, and would allow unifying @profile and @profiled.


Unfortunately, the solution button and the like button are very close to each other and it is very easy to misclick, especially on a mobile phone. So maybe that’s what happened. In any case, you should be able to revoke the solution tag (I think).


I implemented this in Profiler: Improve compatibility with Pluto.jl and friends. by maleadt · Pull Request #2139 · JuliaGPU/CUDA.jl · GitHub. CUDA.@profiled is no more; CUDA.@profile now returns a ProfileResult object that displays as before. You can peek into that struct to get the profile data, but I’d recommend adding some accessors (e.g. get_kernels(::ProfileResult), an iterator, etc.) based on your use case and submitting a pull request for them. I’m happy to support accessors like that; the internal fields, however, can change between releases.
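With that change, capturing the result looks roughly like this (a sketch against CUDA.jl master after that PR; it needs a CUDA-capable GPU, and the internal field names are deliberately not stable):

```julia
using CUDA

a = CUDA.rand(1024, 1024, 1024)
res = CUDA.@profile trace=true CUDA.@sync a .+ a

res                 # displays the usual profiler tables via `show`
propertynames(res)  # inspect the (unstable) internal fields
```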


Ah, thank you, I imagine that was the case :blush:

I see @maleadt and others have gone ahead and implemented what I was asking about, so I think it is fair to mark his new answer at the bottom as the solution for now!

Kind regards