I am currently using the following to pass a CuArray to a PyTorch tensor:
    using PyCall
    tc = pyimport("torch")
    using CuArrays
    x = cu(randn(1024, 1024))
    @time tc.from_numpy(Array(x)).cuda()
      0.025017 seconds (62 allocations: 20.00 MiB) ...
Array(x) moves x from the GPU to the CPU, and .cuda() then moves it from the CPU back to the GPU. Is there a way to stay on the GPU and convert a CuArray to a PyTorch tensor without going back and forth between the CPU and the GPU?
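To make the round trip explicit, here is the same call split into its three steps. This is only a sketch of what I believe is happening, not a solution: the intermediate names x_cpu, t_cpu and t_gpu are just for illustration, and it needs a CUDA-capable GPU to run.

```julia
using PyCall
using CuArrays

tc = pyimport("torch")
x = cu(randn(1024, 1024))        # data lives on the GPU

# Step 1: device -> host copy into a regular Julia Array
@time x_cpu = Array(x)

# Step 2: hand the host array to PyTorch as a CPU tensor
@time t_cpu = tc.from_numpy(x_cpu)

# Step 3: host -> device copy, giving a CUDA tensor
@time t_gpu = t_cpu.cuda()
```

Steps 1 and 3 are the two transfers I would like to avoid.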
Calling tc.from_numpy() directly on a CuArray does not work and causes a segfault.
Let me know if you have any ideas about how to do this.
Thank you!