Is there a way to use @allowscalar in a heterogeneous manner using KernelAbstractions?

Hi all,

Is it correct that scalar indexing like Out[1] is not allowed with KernelAbstractions.jl alone, and that we need GPUArrays.@allowscalar for this? Should we always use GPUArrays together with KernelAbstractions for such cases?

using KernelAbstractions
using CUDA
using GPUArrays

backend = CUDABackend()
Out = KernelAbstractions.zeros(backend, Float64, 1)  # allocates a device array (a CuArray here)
GPUArrays.@allowscalar Out[1]                        # scalar indexing on a GPU array is disallowed unless wrapped in @allowscalar

You can simply copy the array back to the CPU first, which is what Out[1] does behind the scenes anyway.
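For example, a minimal sketch reusing the Out array from the snippet above; copying the whole array gives you an ordinary Julia Array that you can index freely:

host = Array(Out)   # one bulk device-to-host copy
host[1]             # plain scalar indexing on the CPU, no @allowscalar needed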


Does CUDA.jl copy the whole array back to the CPU?

It does not: GPUArrays.jl/src/host/indexing.jl at e8e9b031613f31818e75a6c7f8745788fb80b71f · JuliaGPU/GPUArrays.jl · GitHub
So yes, if your data is very large it’s better to index a single item (which only copies that element) or perform a fine-grained copy yourself, rather than copying the whole array back.
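A fine-grained copy might look like this (a sketch, again assuming the Out array from the snippet above; the offset form of copyto! moves only the requested elements into a small host buffer):

buf = Vector{Float64}(undef, 1)
copyto!(buf, 1, Out, 1, 1)   # dest, dest offset, src, src offset, number of elements
buf[1]                       # read the value on the host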