Hi,
in RadonKA.jl I just launch the kernel with the following code:
[....]
    kernel! = radon_kernel!(backend)
    kernel!(sinogram::AbstractArray{T}, img, weights, in_height,
            out_height, angles, mid, radius, absorb_f,
            ndrange=size(sinogram))
    KernelAbstractions.synchronize(backend)
    return sinogram::typeof(img)
end

@kernel function radon_kernel!(sinogram::AbstractArray{T}, img::AbstractArray{T},
                               weights, in_height, out_height, angles, mid,
                               radius, absorb_f) where {T}
    i, iangle, i_z = @index(Global, NTuple)
I was wondering, because the KA docs mention the groupsize.
Should I care about it? And which reasonable value do I choose?
My arrays range from 2D sizes like (256,256) to 3D arrays such as (512,512,512).
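For context, this is how I understand the workgroup size can be set in KernelAbstractions: either when the kernel is instantiated or as a keyword at launch. Below is a minimal sketch with a hypothetical stand-in kernel (`scale_kernel!` is made up for illustration and is not the RadonKA.jl kernel); the `(16, 16)` value is just an assumed example, not a recommendation.

```julia
using KernelAbstractions

# Hypothetical stand-in kernel (NOT the RadonKA.jl kernel): scales each element.
# The read-only input is marked with @Const.
@kernel function scale_kernel!(out, @Const(inp), factor)
    i, j = @index(Global, NTuple)
    @inbounds out[i, j] = factor * inp[i, j]
end

backend = CPU()  # assumption: on a GPU this would be e.g. CUDABackend()
inp = rand(Float32, 256, 256)
out = similar(inp)

# Option 1: fix the workgroup size when instantiating the kernel.
kernel_static! = scale_kernel!(backend, (16, 16))
kernel_static!(out, inp, 2f0; ndrange=size(out))
KernelAbstractions.synchronize(backend)

# Option 2: leave it dynamic and pass the workgroup size at launch time.
out2 = similar(inp)
kernel_dynamic! = scale_kernel!(backend)
kernel_dynamic!(out2, inp, 2f0; ndrange=size(out2), workgroupsize=(16, 16))
KernelAbstractions.synchronize(backend)
```

In my code above I pass neither, so I assume the backend picks a default workgroup size for me.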
I also tried annotating all arguments except sinogram with @Const. It didn’t improve performance.
Are there any other free performance tricks I can use?
Thanks!
Felix