I am wondering whether there is any way to pass CUDA.jl arrays directly to C/C++, the same way one passes a Core.Ref. For example, I would like to define a C function that takes a device memory pointer (of type cuComplex*, etc.) and pass a CuPtr to it.
I am asking because I want to minimize data transfer between the device and the host. Of course, I could pass the arrays to C functions by first copying them to the host, but once the CuArrays become large, this adds significant overhead…