Calling a function inside of a Kernel

James_Stickney · June 6, 2019, 8:10pm

This one has my head hurting. I’m not even sure how to explain the problem, but I’ll do my best.

I am working on a project using CUDAnative and one of my kernels takes a function as an argument.

I would like to pass additional arguments to this function. If I pass a number as an argument it works, but if I pass a variable it doesn’t. Is there a way to “apply” variables as constants? Can this be done with a macro?

This minimal example shows what I mean:

using CUDAnative
using CuArrays

function applyfun(x::CuDeviceArray, F::CuDeviceArray, f)
    i = ( blockIdx().x - 1) * blockDim().x + threadIdx().x

    F[i] = f(x[i])
    return nothing
end

nthreads = 512
nblocks = 10
F = cuzeros(nthreads * nblocks)
x = CuArray( LinRange(0, 10, nthreads*nblocks) )

f(x, c) = c*x^2

# This works
g1(x) = f(x,1)
@cuda blocks=nblocks threads=nthreads applyfun(x, F, g1 )

# This doesn't work
b = 1
g2(x) = f(x,b)
@cuda blocks=nblocks threads=nthreads applyfun(x, F, g2 )

maleadt · June 6, 2019, 10:43pm

That doesn’t work because the captured b can be modified. Make it a const and it works:

julia> const c = 1
1

julia> g3(x) = f(x,c)
g3 (generic function with 1 method)

julia> @cuda blocks=nblocks threads=nthreads applyfun(x, F, g3)
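
For anyone wondering why the const matters, here is a small CPU-only sketch (no GPU packages needed) that asks Julia what it can infer about each wrapper. It reuses the names f, b, c, g2 and g3 from this thread; t2 and t3 are names made up for the illustration. The assumption is that the GPU compiler needs fully inferred code, which a non-const global prevents because it can be rebound to a value of any type at any time.

```julia
f(x, c) = c * x^2

b = 1              # non-const global: can be rebound to anything later
g2(x) = f(x, b)

const c = 1        # const global: the compiler may rely on its type
g3(x) = f(x, c)

# Ask the compiler which return type it infers for a Float64 argument.
t2 = Base.return_types(g2, (Float64,))
t3 = Base.return_types(g3, (Float64,))

println("g2 (non-const global): ", t2)
println("g3 (const global):     ", t3)
```

On the CPU the first wrapper should come back as Any and the second as Float64. The CPU can fall back to dynamic dispatch for the Any case, but a GPU kernel has no such fallback.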

James_Stickney · June 7, 2019, 12:09am

I tried that, but it doesn’t help because I need to run a model that loops over the variable, so it can’t be constant. Is there another way to do it?

maleadt · June 7, 2019, 1:04am

Sure, but then you can’t just capture a CPU variable. You’ll need to put that counter in GPU memory, e.g. using a single-element array.
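
A rough sketch of what the single-element array could look like, assuming the same CUDAnative/CuArrays setup as the example above. The kernel name applyfun_dev and the extra C argument are made up for this illustration, and the array is passed as a kernel argument here rather than captured.

```julia
using CUDAnative
using CuArrays

# Hypothetical kernel variant: the coefficient is read from GPU memory.
function applyfun_dev(x::CuDeviceArray, F::CuDeviceArray, f, C::CuDeviceArray)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    F[i] = f(x[i], C[1])   # every thread reads the same single element
    return nothing
end

f(x, c) = c * x^2

nthreads = 512
nblocks = 10
F = cuzeros(nthreads * nblocks)
x = CuArray(LinRange(0, 10, nthreads * nblocks))

C = CuArray([1f0])             # single-element array living in GPU memory
for c in 1:3
    copyto!(C, [Float32(c)])   # overwrite the value on the device
    @cuda blocks=nblocks threads=nthreads applyfun_dev(x, F, f, C)
end
```

After the loop, F holds the result of the last launch (c = 3). The kernel is compiled once and only the contents of C change between launches.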

James_Stickney · June 7, 2019, 2:23pm

Thanks, that solves my problem. What I found worked was to pass the variable into the kernel.

This works:

using CUDAnative
using CuArrays

function applyfun(x::CuDeviceArray, F::CuDeviceArray, f, C)
    i = ( blockIdx().x - 1) * blockDim().x + threadIdx().x

    F[i] = f(x[i], C)

    return nothing
end

nthreads = 512
nblocks = 10

F = cuzeros(nthreads * nblocks)
x = CuArray( LinRange(0, 10, nthreads*nblocks) )

f(x, c) = c*x^2

C = 2
@cuda blocks=nblocks threads=nthreads applyfun(x, F, f, C)
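
For the model loop mentioned earlier, the same kernel can be launched once per value. This is only a sketch: the values 1, 2, 3 and the results vector are placeholders made up for the illustration, not part of the actual model.

```julia
using CUDAnative
using CuArrays

function applyfun(x::CuDeviceArray, F::CuDeviceArray, f, C)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    F[i] = f(x[i], C)
    return nothing
end

nthreads = 512
nblocks = 10

F = cuzeros(nthreads * nblocks)
x = CuArray(LinRange(0, 10, nthreads * nblocks))

f(x, c) = c * x^2

# Placeholder loop: one kernel launch per value of C.
results = Vector{Vector{Float32}}()
for C in (1, 2, 3)
    @cuda blocks=nblocks threads=nthreads applyfun(x, F, f, C)
    push!(results, Array(F))   # copy the result back to the CPU
end
```

Since C is a plain number, it is passed to the kernel by value on every launch, so it does not need to be a constant.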

I would like to learn more about metaprogramming. Do you think it’s possible to capture variables using a macro?
