[ANN] Announcing NumPyArrays.jl

Synopsis

I am pleased to announce NumPyArrays.jl at version v0.1.1. Initial registration in the General registry is in process.

This package facilitates the conversion of certain Julia SubArray, Base.ReinterpretArray, Base.ReshapedArray, and PermutedDimsArray instances into NumPy arrays without copying, provided they have a mutable parent or ancestor. It may also allow other arrays for which strides is applicable, under the same condition.

PyCall.jl

Currently, PyCall.jl will happily convert an Array into a numpy.ndarray.

julia> using PyCall

julia> A = rand(UInt8, 4, 4)
4×4 Matrix{UInt8}:
 0x39  0x94  0xb0  0x0e
 0x96  0x92  0xf6  0xbe
 0x29  0x7f  0x84  0xbc
 0x12  0x29  0xea  0xc0

julia> pyA = PyObject(A)
PyObject array([[ 57, 148, 176,  14],
       [150, 146, 246, 190],
       [ 41, 127, 132, 188],
       [ 18,  41, 234, 192]], dtype=uint8)

julia> pytypeof(pyA)
PyObject <class 'numpy.ndarray'>

However, PyCall.jl does not perform no-copy conversions for every kind of Julia array that Base produces.

Converting arrays produced by @view or reinterpret into a numpy array

If you use @view or reinterpret, PyCall converts the resulting Julia arrays to Python lists by copying.

julia> sA = @view A[2:3,2:3]
2×2 view(::Matrix{UInt8}, 2:3, 2:3) with eltype UInt8:
 0x92  0xf6
 0x7f  0x84

julia> rA = reinterpret(Int8, sA)
2×2 reinterpret(Int8, view(::Matrix{UInt8}, 2:3, 2:3)):
 -110   -10
  127  -124

julia> PyObject(sA)
PyObject [[146, 246], [127, 132]]

julia> pytypeof(ans)
PyObject <class 'list'>

julia> PyObject(rA)
PyObject [[-110, -10], [127, -124]]

julia> pytypeof(ans)
PyObject <class 'list'>

NumPyArrays.jl facilitates these conversions.

julia> using NumPyArrays

julia> npsA = NumPyArray(sA)
2×2 NumPyArray{UInt8, 2}:
 0x92  0xf6
 0x7f  0x84

julia> pytypeof(npsA)
PyObject <class 'numpy.ndarray'>

julia> nprA = NumPyArray(rA)
2×2 NumPyArray{Int8, 2}:
 -110   41
  127  -80

julia> pytypeof(nprA)
PyObject <class 'numpy.ndarray'>
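
One way to check that a conversion shares memory rather than copying is to write through the Julia parent and read the value back through the wrapper. The following is a minimal sketch, assuming PyCall.jl and NumPyArrays.jl are installed and that a NumPyArray supports ordinary indexing, as its array-style display above suggests. The variable names are only illustrative.

```julia
using PyCall, NumPyArrays

A = zeros(UInt8, 4, 4)
sA = @view A[2:3, 2:3]
npsA = NumPyArray(sA)

# Write through the Julia parent after the conversion has happened.
A[2, 2] = 0x2a

# sA[1, 1] aliases A[2, 2]; if npsA shares that memory, this prints true.
println(npsA[1, 1] == 0x2a)
```

If this prints false, the wrapper holds a copy rather than a view of the parent's memory.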

Compatibility with PyCall.PyObject and PyCall.PyArray

A NumPyArray is easily converted into a PyObject or PyArray, which makes it compatible with much of the PyCall API. Internally, NumPyArray just wraps a PyArray.

julia> PyObject(nprA)
PyObject array([[-110,   41],
       [ 127,  -80]], dtype=int8)

julia> PyArray(nprA)
2×2 PyArray{Int8, 2}:
 -110   41
  127  -80

julia> fieldnames(typeof(nprA))
(:pa,)

julia> nprA.pa
2×2 PyArray{Int8, 2}:
 -110   41
  127  -80

julia> py"Main.nprA + 1"
2×2 Matrix{Int8}:
 -109   42
 -128  -79

julia> py"Main.rA + 1"
ERROR: PyError ...

PermutedDimsArray and Base.ReshapedArray

As I mentioned above, NumPyArrays.jl also facilitates the conversion of PermutedDimsArray and Base.ReshapedArray:

julia> pdA = PermutedDimsArray(A, [2,1])
4×4 PermutedDimsArray(::Matrix{UInt8}, (2, 1)) with eltype UInt8:
 0x39  0x96  0x29  0x12
 0x94  0x92  0x7f  0x29
 0xb0  0xf6  0x84  0xea
 0x0e  0xbe  0xbc  0xc0

julia> pytypeof(PyObject(pdA))
PyObject <class 'list'>

julia> nppdA = NumPyArray(pdA)
4×4 NumPyArray{UInt8, 2}:
 0x39  0x96  0x29  0x12
 0x94  0x92  0x7f  0x29
 0xb0  0xf6  0x84  0xea
 0x0e  0xbe  0xbc  0xc0

julia> pytypeof(nppdA)
PyObject <class 'numpy.ndarray'>

julia> py"Main.nppdA + 1"
4×4 Matrix{UInt8}:
 0x3a  0x97  0x2a  0x13
 0x95  0x93  0x80  0x2a
 0xb1  0xf7  0x85  0xeb
 0x0f  0xbf  0xbd  0xc1

julia> rsA = Base.ReshapedArray(A, (2,8), ())
2×8 reshape(::Matrix{UInt8}, 2, 8) with eltype UInt8:
 0x39  0x29  0x94  0x7f  0xb0  0x84  0x0e  0xbc
 0x96  0x12  0x92  0x29  0xf6  0xea  0xbe  0xc0

julia> pytypeof(PyObject(rsA))
PyObject <class 'list'>

julia> nprsA = NumPyArray(rsA)
2×8 NumPyArray{UInt8, 2}:
 0x39  0x29  0x94  0x7f  0xb0  0x84  0x0e  0xbc
 0x96  0x12  0x92  0x29  0xf6  0xea  0xbe  0xc0

julia> pytypeof(nprsA)
PyObject <class 'numpy.ndarray'>

Limitations

Not all of the above array types can be converted directly into a NumPy array; sometimes a copy is still needed. NumPyArrays.jl may provide some limited support for these kinds of conversions in the future.

julia> B = reshape(1:16,4,4)
4×4 reshape(::UnitRange{Int64}, 4, 4) with eltype Int64:
 1  5   9  13
 2  6  10  14
 3  7  11  15
 4  8  12  16

julia> typeof(B)
Base.ReshapedArray{Int64, 2, UnitRange{Int64}, Tuple{}}

julia> NumPyArray(B)
ERROR: Only AbstractArrays where strides is applicable can be converted to NumPyArrays.
...

julia> npcB = NumPyArray(copy(B))
4×4 NumPyArray{Int64, 2}:
 1  5   9  13
 2  6  10  14
 3  7  11  15
 4  8  12  16
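
When it is not known in advance whether an array qualifies, the copy can be made only on failure. This is a rough sketch, assuming PyCall.jl and NumPyArrays.jl are installed; numpy_or_copy is a hypothetical helper and not part of the package.

```julia
using PyCall, NumPyArrays

# Hypothetical helper (not part of NumPyArrays.jl): try the no-copy
# conversion first, and convert a copy only if that attempt throws.
function numpy_or_copy(A::AbstractArray)
    try
        return NumPyArray(A)
    catch
        # For the range-backed reshape B above, copy(B) is the array that
        # the session shows converting successfully.
        return NumPyArray(copy(A))
    end
end

B = reshape(1:16, 4, 4)
npB = numpy_or_copy(B)
```

Catching every exception keeps the sketch short but also hides unrelated failures; real code would probably want to match the specific error. Note that the result no longer aliases the original array when the fallback path is taken.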

Implementation Details

The three basic requirements for a no-copy conversion into a NumPy array are

  1. pointer is applicable
  2. strides is applicable
  3. Either the array itself is mutable or it has a mutable parent.
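
These three requirements can be probed with plain Base functions. The sketch below is only an illustration and not the package's actual implementation; call_succeeds, mutable_ancestor, and meets_requirements are hypothetical names. It calls pointer and strides inside try/catch instead of using applicable, because a method can exist for a type and still throw when called.

```julia
# Hypothetical helper: does calling f(A) complete without throwing?
function call_succeeds(f, A)
    try
        f(A)
        return true
    catch
        return false
    end
end

# Hypothetical helper: walk up the parent chain until a mutable array is
# found. Returns nothing if the chain ends without one. Needs Julia 1.5+
# for ismutable.
function mutable_ancestor(A::AbstractArray)
    current = A
    while !ismutable(current)
        p = parent(current)
        p === current && return nothing  # parent(x) === x means there is no further parent
        current = p
    end
    return current
end

# All three requirements from the list above, combined.
meets_requirements(A) =
    call_succeeds(pointer, A) &&
    call_succeeds(strides, A) &&
    mutable_ancestor(A) !== nothing

A = rand(UInt8, 4, 4)
sA = @view A[2:3, 2:3]
B = reshape(1:16, 4, 4)

println(meets_requirements(sA))  # the view's parent A is a mutable Matrix
println(meets_requirements(B))   # B is backed by an immutable UnitRange
```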

Currently, PyCall.jl only allows Base.StridedArray and several other array types such as LinearAlgebra.Adjoint to be converted directly into NumPy arrays without copying. This package loosens this to apply to AbstractArrays where strides is applicable.
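
To make the role of strides concrete: Julia reports strides in elements, while NumPy expresses them in bytes, so a no-copy wrapper would presumably need to scale by the element size. The sketch below uses only Base; byte_strides is a hypothetical helper and is not taken from PyCall.jl or NumPyArrays.jl.

```julia
# Hypothetical helper: convert Julia's element strides into byte strides.
byte_strides(X) = strides(X) .* sizeof(eltype(X))

M = zeros(Float64, 4, 4)
sM = @view M[2:3, 2:3]
pM = PermutedDimsArray(M, (2, 1))

println(strides(M))        # (1, 4): column-major, one column is 4 elements long
println(strides(sM))       # (1, 4): the view keeps the parent's memory layout
println(strides(pM))       # (4, 1): permuting dimensions swaps the strides
println(byte_strides(pM))  # (32, 8): the same strides in bytes for Float64
```

The data is never moved in any of these cases; only the description of how to step through the parent's memory changes.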

More importantly, it also allows immutable arrays with a mutable parent to become NumPy arrays. The requirement of a mutable parent is due to PyCall's facility for managing garbage collection.

This approach originates from PyCall.jl PR #876.

Update: NumPyArrays.jl is now in the General registry.

Are there any Julia arrays that you would like to see supported as their Python equivalents? Let me know.

This is very cool!

Thanks. I was encouraged to release this since I saw others were interested.
