Calling a function inside of a Kernel

This one has my head hurting. I’m not even sure how to explain the problem, but I’ll do my best.

I am working on a project using CUDAnative and one of my kernels takes a function as an argument.

I would like to pass additional arguments to this function. If I pass a number as an argument it works, but if I pass a variable it doesn’t. Is there a way to “apply” variables as constants? Can this be done with a macro?

This minimal example shows what I mean:

using CUDAnative
using CuArrays

function applyfun(x::CuDeviceArray, F::CuDeviceArray, f)
    i = ( blockIdx().x - 1) * blockDim().x + threadIdx().x

    F[i] = f(x[i])
    return nothing
end

nthreads = 512
nblocks = 10
F = cuzeros(nthreads * nblocks)
x = CuArray( LinRange(0, 10, nthreads*nblocks) )

f(x, c) = c*x^2

# This works
g1(x) = f(x,1)
@cuda blocks=nblocks threads=nthreads applyfun(x, F, g1 )

#This doesn't work
b = 1
g2(x) = f(x,b)
@cuda blocks=nblocks threads=nthreads applyfun(x, F, g2 )

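That doesn’t work because the captured b is a non-constant global: it can be reassigned at any time, even to a value of a different type, so the compiler can’t generate static GPU code for g2. Make it a const and it works: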

julia> const c = 1
1

julia> g3(x) = f(x,c)
g3 (generic function with 1 method)

julia> @cuda blocks=nblocks threads=nthreads applyfun(x, F, g3)
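
If you want to see the difference without a GPU, here is a rough CPU-only sketch (not CUDAnative-specific; the names rt_g2 and rt_g3 are just for illustration). It asks Julia what return type it can infer for each wrapper. I would expect Any for the version that reads the non-constant global and a concrete type for the const version, which is roughly why only the latter compiles for the GPU.

```julia
f(x, c) = c * x^2

b = 1              # non-constant global: may be rebound to any value or type
g2(x) = f(x, b)

const c = 1        # constant global: value and type are fixed
g3(x) = f(x, c)

# Ask the compiler what it can infer for a Float32 argument.
rt_g2 = Base.return_types(g2, (Float32,))
rt_g3 = Base.return_types(g3, (Float32,))

println("g2 (non-const global): ", rt_g2)
println("g3 (const global):     ", rt_g3)
```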

I tried that, but it doesn’t help because I need to run a model that loops over the variable, so it can’t be constant. Is there another way to do it?

Sure, but then you can’t just capture a CPU variable. You’ll need to put that counter in GPU memory, e.g. using a single-element array.
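
A minimal sketch of what that could look like, assuming the single-element array is passed to the kernel as an extra argument (the argument c, the loop values, and the Float32 element type are my own choices, not from the original code). It needs a CUDA-capable GPU to run:

```julia
using CUDAnative
using CuArrays

f(x, c) = c * x^2

# Same kernel shape as in the question, plus a device array `c` holding the parameter.
function applyfun(x, F, f, c)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    F[i] = f(x[i], c[1])   # read the current value from GPU memory
    return nothing
end

nthreads = 512
nblocks = 10
n = nthreads * nblocks

x = CuArray(collect(LinRange(0f0, 10f0, n)))
F = CuArray(zeros(Float32, n))
c = CuArray([1f0])          # single-element array living on the GPU

for value in (1f0, 2f0, 3f0)
    copyto!(c, [value])     # update the parameter between launches
    @cuda blocks=nblocks threads=nthreads applyfun(x, F, f, c)
end
```

Since the kernel's argument types stay the same, the kernel should not need to be recompiled when the value changes; only the contents of c are updated.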

Thanks, that solves my problem. What I found worked was to pass the variable into the kernel as an extra argument.

This works:

using CUDAnative
using CuArrays

function applyfun(x::CuDeviceArray, F::CuDeviceArray, f, C)
    i = ( blockIdx().x - 1) * blockDim().x + threadIdx().x

    F[i] = f(x[i], C)

    return nothing
end

nthreads = 512
nblocks = 10

F = cuzeros(nthreads * nblocks)
x = CuArray( LinRange(0, 10, nthreads*nblocks) )

f(x, c) = c*x^2

C = 2
@cuda blocks=nblocks threads=nthreads applyfun(x, F, f, C)
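
A possible alternative that keeps the original three-argument kernel, sketched here on the CPU only: build the one-argument function inside another function, so the value is captured as a local rather than read from a global. The helper name make_g is hypothetical. As far as I know a closure like this is plain data (isbits), so it should be usable as a kernel argument, but I have only shown the CPU side here.

```julia
f(x, c) = c * x^2

# Hypothetical helper: returns a closure with the current value of `c` stored inside it.
make_g(c) = x -> f(x, c)

for c in (1, 2, 3)
    g = make_g(c)
    # The closure holds only an integer field, so isbits(g) should be true.
    println("c = ", c, "  g(2) = ", g(2.0f0), "  isbits: ", isbits(g))
    # On the GPU this would presumably become:
    # @cuda blocks=nblocks threads=nthreads applyfun(x, F, g)
end
```

Note that each distinct captured type gives a distinct closure type, so changing the value alone should not force a new kernel specialization, while changing its type would.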

I would like to learn more about metaprogramming. Do you think it’s possible to capture variables using a macro?