The allocations reported by BenchmarkTools are CPU allocations, and there are always some when launching kernels (we need to allocate kernel parameter buffers to pass to CUDA). To see GPU allocations, you can use CUDA.@time.
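A minimal sketch of the difference, assuming a CUDA-capable device and the CUDA.jl and BenchmarkTools packages are installed:

```julia
using CUDA, BenchmarkTools

a = CUDA.rand(1024)  # hypothetical example array on the GPU

# BenchmarkTools' @btime only reports CPU allocations,
# including the small kernel-parameter buffers mentioned above:
@btime sum($a)

# CUDA.@time additionally reports GPU allocations alongside
# the CPU ones, e.g. "(N CPU allocations: ...) (M GPU allocations: ...)":
CUDA.@time sum(a)
```

Note that the first call includes compilation overhead, so run the timed expression once before trusting the numbers from CUDA.@time.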