I need to be able to stack arbitrarily sized CuArrays and CuTextures into a single kernel input, so that the kernel can loop over the data like so:
(Install https://github.com/cdsousa/CuTextures.jl to run the example.)
using CuTextures, CuArrays, CUDAnative, CUDAdrv

# A plain Julia Array holding 20 small textures
myImages = [CuTexture(CuTextureArray(CuArrays.rand(2,2))) for i = 1:20];
out = CuArrays.rand(1);

function myKernel!(myImages, out)
    for i = 1:length(myImages)
        # Sample each texture and accumulate into the one-element output array
        out[1] += myImages[i](1.3, 1.2)
    end
    return nothing
end

@cuda threads=1 blocks=1 myKernel!(myImages, out)
The above fails with the error “passing and using non-bitstype argument”.
Is there a way to do this right now, or will there be?
My application requires n-dimensional stacks of 2D images. The reason I cannot simply concatenate them into three-dimensional images is the memory allocation problem that results when I have to replace only one of many images.