Help converting a Pytorch tensor to Julia CuArray

I am trying to use PythonCall/juliacall to convert a PyTorch tensor to a Julia CuArray directly from its pointer (I want to avoid unnecessary copying if possible). Does anyone know what I am doing wrong? The output doesn’t look correct, since the array should consist of 1s and 0s only (unless I am just misreading it).

MWE Colab Link

Code

import numpy as np
import torch

sz = (100, 100)
arr = np.random.choice([0, 1], size=sz)

# Step 1: Create a PyTorch tensor and transfer it to GPU
tensor = torch.tensor(arr, dtype=torch.float32).cuda()
print(tensor)

# Step 2: Convert to CuArray using PythonCall
cu_arr = jl.unsafe_wrap(jl.CuArray, jl.PythonCall.getptr(tensor), sz)
print(cu_arr)

Output

tensor([[0., 1., 0.,  ..., 1., 0., 0.],
        [0., 1., 1.,  ..., 1., 1., 1.],
        [0., 1., 1.,  ..., 0., 1., 0.],
        ...,
        [0., 0., 0.,  ..., 1., 1., 0.],
        [0., 1., 1.,  ..., 0., 1., 1.],
        [1., 1., 1.,  ..., 0., 0., 0.]], device='cuda:0')
PythonCall.C.PyObject[PythonCall.C.PyObject(2, Ptr{Nothing} @0x00005bbbbba26dd0) PythonCall.C.PyObject(0, Ptr{Nothing} @0x00005bbbbba26dd0) PythonCall.C.PyObject(133973972021008, Ptr{Nothing} @0x000079daf8e4c1b0) PythonCall.C.PyObject(133978876351936, Ptr{Nothing} @0x000079d93f02f5b0) PythonCall.C.PyObject(19, Ptr{Nothing} @0x7c25d4026712e6d2)...

PythonCall.getptr is an internal function and does not get the pointer you want.

You can get the CUDA pointer from the __cuda_array_interface__.
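For example, something like this should pull the device pointer out (a sketch: CUDA tensors expose __cuda_array_interface__ as a dict whose 'data' entry is a (pointer, read_only) pair):

# Sketch: read the device pointer from the CUDA Array Interface
iface = tensor.__cuda_array_interface__
ptr = iface["data"][0]    # integer device pointer
shape = iface["shape"]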

Oh okay, so if I do this to get the underlying pointer, is there a way to convert that to a Julia pointer using juliacall?

import numpy as np
import torch

sz = (100, 100)
arr = np.random.choice([0, 1], size=sz)

# Step 1: Create a PyTorch tensor and transfer it to GPU
tensor = torch.tensor(arr, dtype=torch.float32).cuda()

# Step 2: Get pointer of the tensor
ptr = tensor.data_ptr()
print("pointer: ", ptr)
pointer:  138064357793280

CUDA.jl provides a CuPtr{T}, but I am having trouble converting this integer to a pointer that Julia understands.

CuPtr{Float32}(pyconvert(UInt, ptr))

perhaps?
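
For reference, plugged into unsafe_wrap from the Python side, that suggestion looks roughly like this (a sketch assuming jl has already run using CUDA, PythonCall; note also that torch tensors are row-major while Julia arrays are column-major, so the wrapped array is effectively the transpose of the tensor):

# Sketch: interpolate the address as a literal for now; a cleaner way to
# pass Python variables into seval comes up later in the thread
cu_ptr = jl.seval(f"CuPtr{{Float32}}(UInt({tensor.data_ptr()}))")
# Wrap the device memory without copying. The dims match tensor.size(),
# but Julia reads the buffer column-major (transposed relative to torch)
cu_arr = jl.unsafe_wrap(jl.CuArray, cu_ptr, tuple(tensor.size()))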

Oh, this is almost there. I just need to figure out how to pass a Python variable into a jl.seval("...") call. When I hardcode the pointer integer, this works:

import numpy as np
import torch

sz = (100, 100)
arr = np.random.choice([0, 1], size=sz)

# Step 1: Create a PyTorch tensor and transfer it to GPU
tensor = torch.tensor(arr, dtype=torch.float32).cuda()
print("Pytorch Tensor : ", tensor)

# Step 2: Get pointer of the tensor
ptr = tensor.data_ptr()
print("pointer: ", ptr)

# DOESN"T WORK
# cu_ptr = jl.seval("""
# CuPtr{Float32}(pyconvert(UInt, ptr))
# """)

# Convert to a Julia CuPtr (I don't know how to pass in the variable ptr)
cu_ptr = jl.seval("""
CuPtr{Float32}(pyconvert(UInt, 138064357793280))
""")
print("julia pointer: ", cu_ptr)

# Convert to CUDA array
cu_arr = jl.unsafe_wrap(jl.CuArray, cu_ptr, sz)
cu_arr

Output

Pytorch Tensor :  tensor([[0., 0., 0.,  ..., 0., 1., 1.],
        [0., 1., 0.,  ..., 1., 1., 1.],
        [0., 1., 1.,  ..., 0., 1., 0.],
        ...,
        [0., 1., 0.,  ..., 0., 1., 1.],
        [1., 0., 1.,  ..., 0., 1., 1.],
        [1., 1., 0.,  ..., 1., 0., 0.]], device='cuda:0')
pointer:  138064357752832
julia pointer:  CuPtr{Float32}(0x00007d919d009e00)
100×100 CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}:
 0.0  1.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  1.0  …  1.0  1.0  0.0  0.0  1.0  1.0  0.0  0.0  1.0
 0.0  1.0  0.0  0.0  1.0  0.0  1.0  1.0  0.0  0.0     1.0  0.0  0.0  0.0  1.0  0.0  1.0  0.0  0.0
 0.0  0.0  1.0  1.0  0.0  1.0  1.0  0.0  1.0  0.0     1.0  0.0  1.0  1.0  0.0  1.0  0.0  0.0  1.0
 0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  1.0  0.0     0.0  0.0  1.0  0.0  1.0  1.0  0.0  1.0  1.0
 1.0  1.0  0.0  1.0  0.0  1.0  0.0  0.0  1.0  0.0     0.0  0.0  0.0  1.0  0.0  1.0  1.0  0.0  0.0
 1.0  0.0  0.0  0.0  0.0  1.0  1.0  1.0  0.0  1.0  …  0.0  0.0  1.0  1.0  1.0  0.0  1.0  0.0  0.0
 1.0  1.0  0.0  0.0  0.0  0.0  1.0  0.0  1.0  0.0     1.0  0.0  1.0  1.0  1.0  0.0  1.0  0.0  1.0
 1.0  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  0.0     1.0  0.0  1.0  0.0  0.0  1.0  0.0  1.0  0.0
 0.0  1.0  0.0  0.0  1.0  0.0  0.0  1.0  0.0  0.0     0.0  1.0  1.0  0.0  1.0  0.0  0.0  0.0  1.0
 0.0  1.0  0.0  0.0  1.0  1.0  0.0  1.0  0.0  0.0     0.0  1.0  0.0  0.0  0.0  0.0  1.0  0.0  1.0
 ⋮                        ⋮                        ⋱                      ⋮                   
 0.0  0.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0  1.0     0.0  0.0  1.0  0.0  0.0  1.0  0.0  1.0  0.0
 0.0  1.0  0.0  1.0  1.0  0.0  0.0  1.0  1.0  1.0     1.0  1.0  1.0  1.0  1.0  0.0  0.0  1.0  1.0
 0.0  0.0  1.0  0.0  1.0  0.0  1.0  0.0  1.0  1.0     1.0  0.0  0.0  0.0  0.0  1.0  0.0  1.0  0.0
 0.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0     1.0  0.0  0.0  0.0  1.0  0.0  0.0  0.0  0.0
 1.0  0.0  1.0  0.0  0.0  1.0  1.0  0.0  0.0  1.0  …  0.0  0.0  1.0  1.0  0.0  1.0  1.0  0.0  1.0
 0.0  1.0  0.0  1.0  0.0  0.0  0.0  1.0  1.0  0.0     0.0  0.0  1.0  1.0  0.0  1.0  0.0  1.0  1.0
 1.0  1.0  1.0  0.0  1.0  0.0  0.0  0.0  1.0  0.0     1.0  0.0  1.0  1.0  0.0  1.0  0.0  1.0  0.0
 0.0  1.0  0.0  1.0  0.0  0.0  1.0  0.0  0.0  0.0     1.0  0.0  1.0  0.0  1.0  0.0  1.0  1.0  1.0
 0.0  0.0  1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0     0.0  0.0  1.0  1.0  0.0  0.0  0.0  0.0  1.0

If you don’t mind the extra dependency, DLPack.jl (https://github.com/pabloferz/DLPack.jl) makes this trivial.

Is that for CUDA only? I wrote the kernels for a package using KernelAbstractions.jl, so I want to build this out in a vendor-neutral way. I am just using PyTorch and CUDA for testing the implementation in Python, but it seems like the pointer approach might be more flexible than DLPack for this?

Specifically this package, btw: https://github.com/Dale-Black/DistanceTransforms.jl/blob/master/src/transform.jl

It is, but you could extend https://github.com/pabloferz/DLPack.jl/blob/main/src/cuda.jl to work for other GPU array types. See https://github.com/pabloferz/DLPack.jl/blob/main/src/DLPack.jl#L36-L39

You can create an anonymous function to pass ptr into:

cu_ptr = jl.seval("""
ptr -> CuPtr{Float32}(pyconvert(UInt, ptr))
""")(ptr)

(But the above suggestions to use DLPack are indeed simpler.)


Brilliant, thank you so much. I will look into DLPack more too. If it’s easy to extend to various GPU vendors, that is incredible.

Hi everyone,

I wanted to revisit this topic and see if anyone has any insights or recommendations on efficiently converting a PyTorch tensor to a Julia CuArray while keeping the data on the GPU throughout the process.

Regarding the pointer approach, I tried the following:

def cu_transform(tensor):
    sz = tuple(tensor.size())
    ptr = tensor.data_ptr()
    # Convert the integer address to a CuPtr{Float32} on the Julia side
    cu_ptr = jl.seval("""
    ptr -> CuPtr{Float32}(pyconvert(UInt, ptr))
    """)(ptr)
    # Wrap the device memory without copying
    cu_arr = jl.unsafe_wrap(jl.CuArray, cu_ptr, sz)
    # transform/boolean_indicator come from DistanceTransforms.jl
    return jl.transform(jl.boolean_indicator(cu_arr))

This approach works fine with torch.cuda tensors outside of training loops. However, when running it inside a training loop, I encountered a CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS). I suspect there might be an issue with the memory management or synchronization between PyTorch and Julia within the training loop context.
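
A couple of defensive measures may help here (assumptions on my part, not verified in this thread): make the tensor contiguous so the raw pointer matches the logical layout, synchronize PyTorch's pending CUDA work before Julia touches the buffer, and keep the tensor referenced for as long as the wrapped CuArray is in use so the caching allocator cannot free or reuse the memory. A sketch along the lines of the earlier cu_transform:

import torch

def cu_transform_sync(tensor):
    # Ensure the pointer describes a dense, contiguous buffer
    tensor = tensor.contiguous()
    # Finish pending PyTorch kernels before Julia reads the memory
    torch.cuda.synchronize()
    sz = tuple(tensor.size())
    cu_ptr = jl.seval("ptr -> CuPtr{Float32}(pyconvert(UInt, ptr))")(tensor.data_ptr())
    cu_arr = jl.unsafe_wrap(jl.CuArray, cu_ptr, sz)
    result = jl.transform(jl.boolean_indicator(cu_arr))
    # tensor must stay alive until Julia is done with cu_arr; dropping
    # the last reference mid-use is one way to hit error 700
    return result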

As an alternative, I explored using DLPack.jl to handle the tensor conversion. According to the DLPack.jl README.md, I tried the following:

pyv = torch.arange(1, 5).reshape(2, 2)
v = DLPack.from_dlpack(pyv)

Unfortunately, I encountered an error:

JuliaError: ArgumentError: The input does not follow the DLPack specification
Stacktrace:
 [1] from_dlpack(o::PyIterable{Any})
   @ DLPack ~/.julia/packages/DLPack/1mZGE/src/DLPack.jl:194

Other workarounds I attempted also led to different errors.

My ultimate goal is to create a wrapper around a CUDA.jl-based function using PythonCall.jl (juliacall) for a Python library. The function works on GPU with a simple torch tensor outside of training loops, but I’m running into issues when integrating it into a training loop. It’s crucial for me to avoid converting back and forth between GPU and CPU, as it would negate the efficiency gains I’m aiming for.

If anyone has successfully used the pointer approach or DLPack.jl within a PyTorch training loop while keeping the data on the GPU throughout the process, I would greatly appreciate your guidance. Additionally, if there are any recommendations on how to seamlessly utilize multiple GPU platforms within a PyTorch training loop, similar to how diffeqpy does it, that would be fantastic.

Thank you in advance for any help or code examples you can provide!

Here is the working DLPack.jl version, thanks to @pabloferz. The trick is passing the tensor through juliacall.convert(PythonCall.Py, tensor) so it reaches Julia as a raw Py object instead of being auto-converted to a wrapper type like the PyIterable{Any} in the error above:

def transform_cuda(tensor):
    tensor_jl = DLPack.from_dlpack(juliacall.convert(PythonCall.Py, tensor))
    result_jl = DistanceTransforms.transform(DistanceTransforms.boolean_indicator(tensor_jl))
    return DLPack.share(result_jl, torch.from_dlpack)
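
For completeness, a sketch of the setup the snippet above presumably assumes (the exact bootstrapping was not shown in the thread):

import torch
import juliacall
from juliacall import Main as jl

# Load the Julia packages once and grab module handles; these names
# match the snippet above, but how they were bound is an assumption
jl.seval("import DLPack, DistanceTransforms, PythonCall")
DLPack = jl.DLPack
DistanceTransforms = jl.DistanceTransforms
PythonCall = jl.PythonCall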

I just released DLPack.jl version 0.3.0. With it, you should be able to just do:

tensor_jl = DLPack.from_dlpack(tensor)

from juliacall.
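
With that, the wrapper above should reduce to something like this (a sketch under the same assumed setup; only the conversion line changes):

def transform_cuda(tensor):
    # DLPack.jl >= 0.3.0 accepts the torch tensor directly, so the
    # juliacall.convert(PythonCall.Py, ...) wrapper is no longer needed
    tensor_jl = DLPack.from_dlpack(tensor)
    result_jl = DistanceTransforms.transform(DistanceTransforms.boolean_indicator(tensor_jl))
    return DLPack.share(result_jl, torch.from_dlpack)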


Not sure why, but I got a little speed boost from that as well!
