Optimize code which uses KernelAbstractions.jl

roflmaostc · February 11, 2024, 4:24pm

Hi,

In RadonKA.jl I launch the kernel with the following code:

[....]
    kernel! = radon_kernel!(backend)
    kernel!(sinogram::AbstractArray{T}, img, weights, in_height, 
            out_height, angles, mid, radius, absorb_f,
            ndrange=size(sinogram))
    KernelAbstractions.synchronize(backend)    
    return sinogram::typeof(img)
end


@kernel function radon_kernel!(sinogram::AbstractArray{T}, img::AbstractArray{T}, 
                               weights, in_height, out_height, angles, mid,
                               radius, absorb_f) where {T}
    i, iangle, i_z = @index(Global, NTuple)

The KA docs mention the groupsize, so I was wondering:
should I care about it, and if so, what is a reasonable value to choose?
My arrays range from 2D sizes like (256,256) to 3D arrays such as (512,512,512).

I also tried annotating all arguments except sinogram with @Const, but that didn't improve performance.
Are there any other free performance tricks I can use?

Thanks!

Felix

vchuravy · February 11, 2024, 5:29pm

KA uses a limited form of auto-tuning to select the group size. I would recommend the native performance tools from CUDA to look at kernel performance.
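
For anyone who wants to compare the automatically chosen group size against a fixed one, here is a minimal sketch. It is not the RadonKA.jl kernel: `scale_kernel!` is a made-up stand-in, the array size is just the small 2D case from the question, and the `(16, 16)` workgroup size is an arbitrary value to experiment with, not a recommendation. It runs on the CPU backend as written; on a GPU the arrays would be GPU arrays instead.

```julia
using KernelAbstractions

# Hypothetical stand-in kernel (not radon_kernel!): scales every element.
@kernel function scale_kernel!(out, @Const(inp), factor)
    i, j = @index(Global, NTuple)
    @inbounds out[i, j] = factor * inp[i, j]
end

inp = rand(Float32, 256, 256)
out_default = similar(inp)
out_fixed = similar(inp)
out_kwarg = similar(inp)
backend = get_backend(inp)   # CPU() for a plain Array

# 1) No workgroup size given: the backend chooses one.
scale_kernel!(backend)(out_default, inp, 2f0; ndrange=size(inp))

# 2) Workgroup size fixed when the kernel object is constructed.
scale_kernel!(backend, (16, 16))(out_fixed, inp, 2f0; ndrange=size(inp))

# 3) Workgroup size passed at launch time via the keyword argument.
scale_kernel!(backend)(out_kwarg, inp, 2f0;
                       ndrange=size(inp), workgroupsize=(16, 16))

KernelAbstractions.synchronize(backend)

# All three launches compute the same result; only the launch configuration differs.
println(out_default == out_fixed == out_kwarg)
```

Whether a fixed size actually helps depends on the kernel and the hardware, so it is worth timing each variant (with a `synchronize` inside the timed region) and, on NVIDIA GPUs, checking the kernel in a profiler such as `CUDA.@profile` or Nsight Compute before settling on a value.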
