``mod1``-based Periodic Indexing on GPUs

I am using the following approach to apply boundary conditions (Dirichlet in x, periodic in y via mod1) in my simulation code, which I want to be performant on GPUs:

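# Wrapper that adds Dirichlet boundaries in x and periodic wrapping in y
struct DirichletArray{A<:AbstractArray, T}
    data::A                  # underlying (GPU) array
    core_boundary_value::T   # returned for i < 1
    edge_boundary_value::T   # returned for i > size(data, 1)
end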

function Base.getindex(arr::DirichletArray, i::Integer, j::Integer)
    nₓ, nᵧ = size(arr.data)
    if i < 1
        # Left/Upper Dirichlet boundary (i < 1)
        return arr.core_boundary_value
    elseif i > nₓ
        # Right/Lower Dirichlet boundary (i > nₓ)
        return arr.edge_boundary_value
    else
        # Periodic in y
        return @inbounds arr.data[i, mod1(j, nᵧ)]
    end
end

Base.size(arr::DirichletArray) = size(arr.data)

Base.length(arr::DirichletArray) = length(arr.data)

I am using the following function to create the array:

function array_with_bounds(gpuArrayType, nx, ny, boundary)
    array = gpuArrayType{Float64}(undef, nx, ny)
    if boundary == "dirichlet"
        n_core = 1.5
        n_edge = 0.5
        return DirichletArray(array, n_core, n_edge)
    else
        error("Unknown boundary type: $(boundary)")
    end
end
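
To check the indexing rules without a GPU, here is a minimal CPU-only sketch that uses a plain `Array` in place of the GPU array type. The module name `BoundaryDemo` is hypothetical and only there to keep the example self-contained; the struct and `getindex` inside it are a compact copy of the definitions above, not anything new.

```julia
module BoundaryDemo  # hypothetical module, only used to isolate this sketch

struct DirichletArray{A<:AbstractArray, T}
    data::A
    core_boundary_value::T
    edge_boundary_value::T
end

function Base.getindex(arr::DirichletArray, i::Integer, j::Integer)
    nx, ny = size(arr.data)
    i < 1  && return arr.core_boundary_value   # Dirichlet boundary at low i
    i > nx && return arr.edge_boundary_value   # Dirichlet boundary at high i
    return @inbounds arr.data[i, mod1(j, ny)]  # periodic in j
end

end # module

using .BoundaryDemo: DirichletArray

# 3×4 test matrix filled column-major, so data[i, j] == i + 3 * (j - 1)
data = reshape(collect(1.0:12.0), 3, 4)
field = DirichletArray(data, 1.5, 0.5)

core_value = field[0, 2]   # i < 1      -> core boundary value
edge_value = field[4, 2]   # i > 3      -> edge boundary value
wrap_high  = field[2, 5]   # j = 5 wraps to column 1
wrap_low   = field[2, 0]   # j = 0 wraps to column 4

println((core_value, edge_value, wrap_high, wrap_low))
```

The same calls should behave identically when `data` is a GPU array, as long as they happen inside a kernel rather than on the host.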

I am concerned about performance on GPUs, since accessing elements one at a time through getindex looks like scalar indexing. When I run it, though, it does not throw a scalar indexing error.

How much performance am I losing by doing it this way? Is there a better way to do this, apart from using ghost cells?

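I imagine you’re only going to call your getindex inside kernels (possibly implicitly)? The scalar indexing error is only raised when host code indexes a GPU array element by element; inside a kernel, each thread indexing the device array is the normal access pattern, which is why you don’t see the error. So I don’t expect any big performance issues, though the mod1 calls will of course add some overhead. A quick benchmark will tell you for certain.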
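
As a rough starting point for such a benchmark, here is a CPU-only sketch that compares `mod1` against a branch-based wrap in a nearest-neighbour access pattern. `wrap_branch` and `neighbour_sum!` are hypothetical helpers written for this illustration, and `wrap_branch` is only valid for indices in `0:(n + 1)`. The timings say nothing definitive about a GPU; there the same comparison would have to run inside a kernel and be synchronized before the timer stops (CUDA.jl provides `CUDA.@sync` for that).

```julia
# Two ways to wrap a periodic index.
wrap_mod1(j, n) = mod1(j, n)
# Hypothetical alternative: assumes j is at most one step outside 1:n.
wrap_branch(j, n) = j < 1 ? j + n : (j > n ? j - n : j)

# Sum the two periodic y-neighbours of every cell using the given wrap function.
function neighbour_sum!(out, data, wrap)
    nx, ny = size(data)
    @inbounds for j in 1:ny, i in 1:nx
        out[i, j] = data[i, wrap(j - 1, ny)] + data[i, wrap(j + 1, ny)]
    end
    return out
end

data = rand(512, 512)
out_mod1 = similar(data)
out_branch = similar(data)

# Call each variant once so compilation is not included in the timing.
neighbour_sum!(out_mod1, data, wrap_mod1)
neighbour_sum!(out_branch, data, wrap_branch)

t_mod1 = @elapsed neighbour_sum!(out_mod1, data, wrap_mod1)
t_branch = @elapsed neighbour_sum!(out_branch, data, wrap_branch)

println("mod1:   ", t_mod1, " s")
println("branch: ", t_branch, " s")
```

A single `@elapsed` measurement is noisy, so repeating the timing several times (or using a benchmarking package) gives a more trustworthy number.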