Hi,
I am wondering whether there is any way to pass a CuArray or CuPtr from CUDA.jl directly to C/C++, in the same way one passes a Core.Array or Core.Ref. For example, I would like to define a C function that takes a device memory pointer (of type double*, cuComplex*, etc.) and pass a CuArray or CuPtr to it.
I am asking because I want to minimize data transfer between the device and the host. Of course, I can pass a CuArray or CuPtr to a C function by first copying the data to the host, but if the CuArray is large, this will incur a big overhead…