Two CuDeviceArrays inside one kernel

Hello everyone,

I was wondering whether something like this is possible:

function test(output, output2, input)
    # Set up two shared memory caches for the current block.
    cache1 = @cuDynamicSharedMem(Int64, (10,10,3))
    cache2 = @cuDynamicSharedMem(Int64, (5,5))

    # Kernels must not return a value.
    return nothing
end

I have two small arrays that I compute on the device, and I want to keep them there for faster access. However, I’m not sure whether I can use shared memory for this, because in my example cache2 overwrites cache1. Is there a way to have two separate arrays that are both shared within one thread block? I tried to read about CuDeviceArrays but couldn’t find any example of how to use them. I’d really appreciate your help.

Greetings