Converting Julia arrays and views to NumPy arrays via PyCall

I am trying to make some Julia code work with Python, and I am having trouble understanding why only certain kinds of Julia arrays translate to NumPy arrays. A regular Julia Array gets translated just fine, but if I use @view or reinterpret I get a Python list rather than a numpy.ndarray.

Am I doing something wrong?

julia> A = rand(Int,256);

julia> pyA = PyObject(A);

julia> pytypeof(pyA)
PyObject <class 'numpy.ndarray'>

julia> pyA_1_to_100 = PyObject(A[1:100]);

julia> pytypeof(pyA_1_to_100)
PyObject <class 'numpy.ndarray'>

julia> pyA_1_to_100_view = PyObject(@view(A[1:100]));

julia> pytypeof(pyA_1_to_100_view)
PyObject <class 'list'>

julia> pyA_reinterpret = PyObject(reinterpret(UInt,A));

julia> pytypeof(pyA_reinterpret)
PyObject <class 'list'>

No, you’re not doing anything wrong. Currently PyCall does not support the conversion of SubArray or Base.ReinterpretArray to NumPy arrays.

julia> typeof(@view(A[1:100]))
SubArray{Int64,1,Array{Int64,1},Tuple{UnitRange{Int64}},true}

julia> typeof(reinterpret(UInt,A))
Base.ReinterpretArray{UInt64,1,Int64,Array{Int64,1}}

I think it is possible to convert more standard Julia array types to NumPy arrays. I’ve created this pull request to try to apply this to types that implement strides:
https://github.com/JuliaPy/PyCall.jl/pull/876

julia> applicable(strides, @view(A[1:100]))
true

julia> applicable(strides, reinterpret(UInt,A))
true
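
To illustrate why strides is the relevant property, here is a minimal sketch in plain Julia (no PyCall involved). It shows that both wrapper types expose a pointer and strides into the parent's memory, which is roughly the information a NumPy array needs to wrap the data without copying. The variable names are only for illustration.

```julia
A = rand(Int, 256)
v = @view A[1:2:100]        # strided view: every second element of A
r = reinterpret(UInt, A)    # the same bytes, read as UInt64

# Julia reports strides in elements, not bytes.
sA = strides(A)
sv = strides(v)             # the view skips every other element
sr = strides(r)

# Both wrappers start at the same address as the parent array.
same_start_view = UInt(pointer(v)) == UInt(pointer(A))
same_start_reinterpret = UInt(pointer(r)) == UInt(pointer(A))

# Writing through the view modifies the parent, so no copy was made.
v[1] = 42
parent_changed = A[1] == 42
```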

I’m not sure what the status of the pull request is. Perhaps @stevengj could comment on the state of the PR.

Thanks. While we are waiting for the PR to be reviewed or merged, are there any workarounds a user can apply? If the PR cannot be merged, would it be possible to put these features into a separate package?

The easiest way would be to copy the data into standard Julia arrays (for example with collect), but this may be problematic if they are GB-sized or the conversion sits in a very hot loop.
Alternatively, is it possible to move the whole calculation to Julia, removing the need for PyCall?
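
A minimal sketch of the copying workaround, in plain Julia, assuming collect is an acceptable way to materialize the wrapper. The result is an ordinary Array, which is the type the transcript above shows PyCall converting to numpy.ndarray, but it no longer shares memory with the original.

```julia
A = rand(Int, 256)
v = @view A[1:100]

c = collect(v)              # materialize the view as a plain Vector{Int}
is_plain_array = c isa Array

# The copy is independent: writing to it leaves A untouched,
# so changes made on the Python side would not be visible in A.
c[1] = A[1] + 1
is_independent = A[1] != c[1]
```

The cost is one full copy per conversion, which is what makes this unattractive for large data.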


I’m moving large images around, so copying data would be quite problematic. Looking at the PR that @mkitti mentioned, it looks possible to override a few PyCall methods to achieve similar functionality.

In particular, if one implemented NpyArray(a::AbstractArray{T}, revdims::Bool) where T<:PYARR_TYPES and pyembed(po::PyObject, jo::Any) for SubArray and Base.ReinterpretArray then maybe it will work?

You might like to try my package PythonCall.jl (cjdoris/PythonCall.jl on GitHub): all mutable objects do non-copying conversion to Python, and any strided array is usable as a NumPy array.


I may need to consider switching to PythonCall for Napari.jl soon for this feature.

OK, here’s the self-contained hack:

julia> using PyCall

julia> A = rand(Int, 256);

julia> pytypeof( PyObject(reinterpret(UInt64, A)) )
PyObject <class 'list'>

julia> pytypeof( PyObject(@view(A[1:100])) )
PyObject <class 'list'>

julia> module PyCallHack
           # Pull in the PyCall names needed to build the NumPy array.
           import PyCall: NpyArray, PYARR_TYPES, @npyinitialize, npy_api, npy_type
           import PyCall: @pycheck, NPY_ARRAY_ALIGNED, NPY_ARRAY_WRITEABLE, pyembed
           import PyCall: PyObject, PyPtr
           # Wrapper array types that should be converted without copying.
           const HACKED_ARRAYS = Union{SubArray{T}, Base.ReinterpretArray{T}, Base.ReshapedArray{T}, Base.PermutedDimsArray{T}} where T <: PYARR_TYPES
           function NpyArray(a::HACKED_ARRAYS{T}, revdims::Bool) where T <: PYARR_TYPES
               @npyinitialize
               size_a = revdims ? reverse(size(a)) : size(a)
               strides_a = revdims ? reverse(strides(a)) : strides(a)
               # Julia strides count elements; they are scaled to bytes below.
               p = @pycheck ccall(npy_api[:PyArray_New], PyPtr,
                   (PyPtr,Cint,Ptr{Int},Cint, Ptr{Int},Ptr{T}, Cint,Cint,PyPtr),
                   npy_api[:PyArray_Type],
                   ndims(a), Int[size_a...], npy_type(T),
                   Int[strides_a...] * sizeof(eltype(a)), a, sizeof(eltype(a)),
                   NPY_ARRAY_ALIGNED | NPY_ARRAY_WRITEABLE,
                   C_NULL)
               return PyObject(p, a)
           end
           # Embed the parent array rather than the wrapper itself.
           pyembed(po::PyObject, jo::HACKED_ARRAYS) = pyembed(po, jo.parent)
       end
Main.PyCallHack

julia> pytypeof( PyObject(reinterpret(UInt64, A)) )
PyObject <class 'numpy.ndarray'>

julia> pytypeof( PyObject(@view(A[1:100])) )
PyObject <class 'numpy.ndarray'>
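
For anyone wondering what the size and stride arithmetic in the hack amounts to, here is a small sketch in plain Julia. The helper numpy_layout is hypothetical (it is not part of PyCall); it only mirrors the revdims handling and the element-to-byte stride scaling from the NpyArray method above.

```julia
# Hypothetical helper mirroring the size/stride arithmetic of the hack.
function numpy_layout(a::AbstractArray, revdims::Bool=false)
    sz = revdims ? reverse(size(a)) : size(a)
    st = revdims ? reverse(strides(a)) : strides(a)
    # Julia strides count elements; the byte strides are elements * sizeof.
    return collect(sz), collect(st) .* sizeof(eltype(a))
end

M = zeros(Float64, 4, 6)
v = @view M[2:3, 1:2:5]     # 2x3 view: rows 2..3, every second column

shape, bytestrides = numpy_layout(v)
rshape, rbytestrides = numpy_layout(v, true)   # dimensions reversed
```

Here the column step of 2 in a parent with 4 rows gives an element stride of 8, so the byte strides come out as 8 and 64 for Float64 data; with revdims set, both the shape and the strides are simply reversed.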