Hi, I’m trying to write a kernel (using KernelAbstractions.jl) to evaluate a function f
over an arbitrary dimensional Cartesian grid, which is represented by vectors rs...
. To give a concrete example, for a 2D grid I can write the following:
using KernelAbstractions
@kernel function map_grid_kernel!(dest, f, x, y)
J = @index(Global, NTuple)
dest[J...] = f(x[J[1]], y[J[2]])
end
function map_grid!(dest, f, x, y)
backend = get_backend(dest)
f! = map_grid_kernel!(backend)
f!(dest, f, x, y; ndrange=size(dest))
end
Then the following works just fine:
x = LinRange(-3, 3, 128)
y = copy(x)
dest = Array{Float64}(undef, length(x), length(y))
f(x...) = exp(-sum(abs2, x))
map_grid!(dest, f, x, y)
I’m having problems generalizing this to accept grids of arbitrary dimension. Although I’m able to write
@kernel function map_grid_arbitrary!(dest, f, rs...)
J = @index(Global, NTuple)
dest[J...] = f(ntuple(n -> rs[n][J[n]], ndims(dest))...)
end
function map_grid_arbitrary!(dest, f, rs...)
backend = get_backend(dest)
f! = map_grid_arbitrary!(backend)
f!(dest, f, rs...; ndrange=size(dest))
end
this version allocates more then the previous one when called with the same arguments, and is also much slower. I couldn’t find a solution that matches the speed of the explicit 2D case. Does anyone has an idea of what can be done here? Thanks in advance!