I need to stack arbitrarily sized CuArrays and CuTextures and pass them as a single input to a kernel, so that the kernel can loop over the data like so:
(The example needs CuTextures.jl to run: cdsousa/CuTextures.jl on GitHub, now deprecated and moved into CUDA.jl.)
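using CuTextures, CuArrays, CUDAnative, CUDAdrv

# A host-side Vector holding 20 textures, each backed by a 2x2 array
myImages = [CuTexture(CuTextureArray(CuArrays.rand(2, 2))) for i = 1:20];
out = CuArrays.rand(1);

function myKernel!(myImages, out)
    for i = 1:length(myImages)
        # Sample each texture and accumulate into the output array
        # (out[1] += ..., since a plain `out += ...` would only rebind the local name)
        out[1] += myImages[i](1.3, 1.2)
    end
    return nothing
end

@cuda threads=1 blocks=1 myKernel!(myImages, out)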
The above fails with the error “passing and using non-bitstype argument”.
Is there a way to do this right now, or will there be?
My application requires n-dimensional stacks of 2D images. I cannot simply concatenate them into three-dimensional images, because replacing just one of many images would then force a reallocation of the whole stack.
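For context, my understanding of the error is that kernel arguments must be bits types, and a host-side Vector is not one regardless of what it holds. Here is a minimal host-side sketch of that check, using only Base Julia (no GPU needed). `Sample` is a hypothetical stand-in for a device-side texture handle, not a type from CuTextures.jl.

```julia
# Host-side check only; runs in plain Julia without a GPU.
# `Sample` is a hypothetical stand-in for a device-side texture handle.
struct Sample
    handle::UInt64
end

stack_as_vector = [Sample(i) for i in 1:3]  # heap-allocated, mutable container
stack_as_tuple  = Tuple(stack_as_vector)    # immutable, elements stored inline

isbits(stack_as_vector)  # false: this is the kind of argument the error rejects
isbits(stack_as_tuple)   # true: a tuple of bits-type elements is itself a bits type
```

I have not checked whether @cuda accepts a tuple of textures in practice. A tuple also fixes the number of images at compile time, so it may not fit stacks whose size changes.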