What is the current status of KernelAbstractions in the Julia GPU ecosystem? Is it seen as an exploratory package for trying out possible directions of heterogeneous CPU/GPU programming, or is it aiming to become a standard for future code development, agnostic to CPU/GPU?
Why are return statements not permitted in kernel functions? There is no such issue with kernels written with CUDA.jl.
Why does one have to be so strict about kernel events ending? I see wait(event) in every example in the documentation. At the same time, as far as I understand, kernels written with CUDA.jl are also asynchronous, but no one is forced to use @sync after every call.
In CUDA.jl there is a launch_configuration function which allows one to determine an optimal number of threads and blocks for launching a kernel. Is there a similar function in KernelAbstractions?
KA (KernelAbstractions) is currently acting as a more minimal, cross-vendor alternative to writing vendor-specific kernel functions. I don’t think it’s going anywhere but up; it already works well with CUDA and AMDGPU (WIP), and it’s kept well maintained and tested by @vchuravy and users in the HPC space.
My guess is that this causes problematic behavior due to thread divergence, but I’m not clear on the exact reasoning. It might also be related to how KA optimizes code: return statements could make that harder if code paths diverge significantly.
It’s not strict, just explicit. You don’t have to call wait(event) right after a kernel is launched; in fact, you never need to call it if you don’t want to. It’s just an indicator that a kernel has finished, and it lets other dependent kernels execute in the order the user expects (see the sketch below). AMDGPU.jl also does this, and it has worked out well.
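To make that concrete, here is a minimal sketch of event-based ordering, assuming the CUDAKernels backend; the add_one! kernel and the sizes are hypothetical, just for illustration:

```julia
using KernelAbstractions, CUDA, CUDAKernels  # CUDAKernels provides CUDADevice

# Hypothetical kernel: increment every element.
@kernel function add_one!(a)
    i = @index(Global)
    a[i] += 1
end

a = CUDA.zeros(Int, 1024)
kernel = add_one!(CUDADevice(), 256)  # static workgroup size of 256

ev1 = kernel(a; ndrange=length(a))                    # launch is asynchronous
ev2 = kernel(a; ndrange=length(a), dependencies=ev1)  # ordered after ev1
wait(ev2)  # block the host only when the result is actually needed
```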
For CUDAKernels, launch configuration will be calculated automatically if workgroupsize is set to nothing.
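For example (a sketch reusing the hypothetical add_one! kernel from above), omitting the workgroup size leaves it as nothing, so the backend picks the launch configuration itself:

```julia
using KernelAbstractions, CUDA, CUDAKernels

@kernel function add_one!(a)  # same hypothetical kernel as above
    i = @index(Global)
    a[i] += 1
end

a = CUDA.zeros(Int, 1024)
kernel = add_one!(CUDADevice())        # workgroupsize defaults to nothing
event  = kernel(a; ndrange=length(a))  # threads/blocks chosen automatically
wait(event)
```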
Statements that return a value are also not permitted in CUDA.jl (only return or return nothing; I assume the same applies to KA.jl). The reason is that there’s no clear meaning – what if different threads return different values? – and it could make the kernel launch synchronous. CUDA C also does not allow returning values from kernels.
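To illustrate with CUDA.jl (a hypothetical kernel, just as a sketch): a bare return for an early exit is fine, but returning a value makes the launch fail, since kernels must return nothing:

```julia
using CUDA

function clamp_negative!(a)
    i = threadIdx().x
    i > length(a) && return  # bare `return` is allowed: this thread just exits
    if a[i] < 0f0
        a[i] = 0f0
    end
    return nothing  # `return a[i]` here would error at launch time
end

a = CUDA.rand(Float32, 32) .- 0.5f0
@cuda threads=32 clamp_negative!(a)
```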