I am unsure how to use CUDA.Const to mark an array as constant during the execution of a CUDA kernel. From the few examples I have seen, it looks like I just need to define a variable within the kernel that “shadows” the original array, as in cxs = CUDA.Const(xs), and then use the new variable cxs directly. I tried that and it works, but surprisingly I do not see any performance benefit from it, although I would expect a significant one (the same kernel, rewritten to use shared memory for xs, is almost twice as fast).
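For concreteness, here is a minimal sketch of the pattern described above, assuming a simple elementwise kernel. The kernel name scale_kernel!, the array size, the launch configuration, and the doubling operation are hypothetical placeholders for illustration, not the actual kernel in question.

```julia
using CUDA

# Hypothetical elementwise kernel: reads xs through a CUDA.Const wrapper.
function scale_kernel!(ys, xs)
    # Wrap the device array; reads through cxs are marked as read-only loads.
    cxs = CUDA.Const(xs)
    i = (blockIdx().x - 1) * blockDim().x + threadIdx().x
    if i <= length(ys)
        @inbounds ys[i] = 2 * cxs[i]
    end
    return nothing
end

n = 1024                      # illustrative size
xs = CUDA.rand(Float32, n)
ys = similar(xs)

threads = 256
blocks = cld(n, threads)
@cuda threads=threads blocks=blocks scale_kernel!(ys, xs)
```

In a sketch like this, each element of xs is read only once per thread, so there is little reuse for a read-only cache to exploit; that may differ from a kernel where shared memory helps.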
Am I missing something or is this expected?