Synopsis
I am pleased to announce NumPyArrays.jl at version v0.1.1. Initial registration in the General registry is in progress.
This package facilitates the conversion of certain Julia SubArrays, Base.ReinterpretArrays, Base.ReshapedArrays, and PermutedDimsArrays into NumPy arrays without copying if they have a mutable parent or ancestor. It also potentially allows other arrays where strides is applicable if they have a mutable parent or ancestor.
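To make "without copying" concrete, here is a small sketch in plain Julia (no PyCall involved) showing that these wrapper types share memory with their parent array. This is only an illustration of the Julia side; the variable names are chosen for the example.

```julia
# A mutable parent array.
A = zeros(UInt8, 4, 4)

# Three wrappers over the same memory; none of these copies A.
sA  = @view A[2:3, 2:3]              # SubArray
rA  = reinterpret(Int8, A)           # Base.ReinterpretArray
pdA = PermutedDimsArray(A, (2, 1))   # PermutedDimsArray

# Writing through the view changes the parent...
sA[1, 2] = 0xff
A[2, 3]      # → 0xff

# ...and the other wrappers observe the same change.
rA[2, 3]     # → -1 (0xff reinterpreted as Int8)
pdA[3, 2]    # → 0xff (dimensions swapped)

# Each wrapper reports A as its parent.
parent(sA) === A   # → true
```

A no-copy conversion to NumPy aims to extend this sharing across the language boundary, so that Python sees the same memory rather than a snapshot.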
PyCall.jl
Currently, PyCall.jl will happily convert an Array into a numpy.ndarray.
julia> using PyCall
julia> A = rand(UInt8, 4, 4)
4×4 Matrix{UInt8}:
0x39 0x94 0xb0 0x0e
0x96 0x92 0xf6 0xbe
0x29 0x7f 0x84 0xbc
0x12 0x29 0xea 0xc0
julia> pyA = PyObject(A)
PyObject array([[ 57, 148, 176, 14],
[150, 146, 246, 190],
[ 41, 127, 132, 188],
[ 18, 41, 234, 192]], dtype=uint8)
julia> pytypeof(pyA)
PyObject <class 'numpy.ndarray'>
However, PyCall.jl does not perform no-copy conversions for every kind of array that Base can produce.
Converting arrays produced by @view or reinterpret into a numpy array
If you use @view or reinterpret, PyCall converts the resulting Julia arrays to Python lists by copying.
julia> sA = @view A[2:3,2:3]
2×2 view(::Matrix{UInt8}, 2:3, 2:3) with eltype UInt8:
0x92 0xf6
0x7f 0x84
julia> rA = reinterpret(Int8, sA)
2×2 reinterpret(Int8, view(::Matrix{UInt8}, 2:3, 2:3)):
-110 -10
127 -124
julia> PyObject(sA)
PyObject [[146, 246], [127, 132]]
julia> pytypeof(ans)
PyObject <class 'list'>
julia> PyObject(rA)
PyObject [[-110, -10], [127, -124]]
julia> pytypeof(ans)
PyObject <class 'list'>
NumPyArrays.jl facilitates these conversions.
julia> using NumPyArrays
julia> npsA = NumPyArray(sA)
2×2 NumPyArray{UInt8, 2}:
0x92 0xf6
0x7f 0x84
julia> pytypeof(npsA)
PyObject <class 'numpy.ndarray'>
julia> nprA = NumPyArray(rA)
2×2 NumPyArray{Int8, 2}:
-110 41
127 -80
julia> pytypeof(nprA)
PyObject <class 'numpy.ndarray'>
Compatibility with PyCall.PyObject and PyCall.PyArray
A NumPyArray is easily converted into a PyObject or PyArray, making it compatible with much of the PyCall API. Internally, NumPyArray just wraps a PyArray.
julia> PyObject(nprA)
PyObject array([[-110, 41],
[ 127, -80]], dtype=int8)
julia> PyArray(nprA)
2×2 PyArray{Int8, 2}:
-110 41
127 -80
julia> fieldnames(typeof(nprA))
(:pa,)
julia> nprA.pa
2×2 PyArray{Int8, 2}:
-110 41
127 -80
julia> py"Main.nprA + 1"
2×2 Matrix{Int8}:
-109 42
-128 -79
julia> py"Main.rA + 1"
ERROR: PyError ...
PermutedDimsArray and Base.ReshapedArray
As I mentioned above, NumPyArrays.jl also facilitates the conversion of PermutedDimsArray and Base.ReshapedArray:
julia> pdA = PermutedDimsArray(A, [2,1])
4×4 PermutedDimsArray(::Matrix{UInt8}, (2, 1)) with eltype UInt8:
0x39 0x96 0x29 0x12
0x94 0x92 0x7f 0x29
0xb0 0xf6 0x84 0xea
0x0e 0xbe 0xbc 0xc0
julia> pytypeof(PyObject(pdA))
PyObject <class 'list'>
julia> nppdA = NumPyArray(pdA)
4×4 NumPyArray{UInt8, 2}:
0x39 0x96 0x29 0x12
0x94 0x92 0x7f 0x29
0xb0 0xf6 0x84 0xea
0x0e 0xbe 0xbc 0xc0
julia> pytypeof(nppdA)
PyObject <class 'numpy.ndarray'>
julia> py"Main.nppdA + 1"
4×4 Matrix{UInt8}:
0x3a 0x97 0x2a 0x13
0x95 0x93 0x80 0x2a
0xb1 0xf7 0x85 0xeb
0x0f 0xbf 0xbd 0xc1
julia> rsA = Base.ReshapedArray(A, (2,8), ())
2×8 reshape(::Matrix{UInt8}, 2, 8) with eltype UInt8:
0x39 0x29 0x94 0x7f 0xb0 0x84 0x0e 0xbc
0x96 0x12 0x92 0x29 0xf6 0xea 0xbe 0xc0
julia> pytypeof(PyObject(rsA))
PyObject <class 'list'>
julia> nprsA = NumPyArray(rsA)
2×8 NumPyArray{UInt8, 2}:
0x39 0x29 0x94 0x7f 0xb0 0x84 0x0e 0xbc
0x96 0x12 0x92 0x29 0xf6 0xea 0xbe 0xc0
julia> pytypeof(nprsA)
PyObject <class 'numpy.ndarray'>
Limitations
Not all of the above array types can be directly converted into a NumPy array. Sometimes a copy still needs to be made. NumPyArrays may provide some limited support for these kinds of conversions in the future.
julia> B = reshape(1:16,4,4)
4×4 reshape(::UnitRange{Int64}, 4, 4) with eltype Int64:
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
julia> typeof(B)
Base.ReshapedArray{Int64, 2, UnitRange{Int64}, Tuple{}}
julia> NumPyArray(B)
ERROR: Only AbstractArrays where strides is applicable can be converted to NumPyArrays.
...
julia> npcB = NumPyArray(copy(B))
4×4 NumPyArray{Int64, 2}:
1 5 9 13
2 6 10 14
3 7 11 15
4 8 12 16
Implementation Details
The three basic requirements for a no copy conversion into a NumPy array are
- pointer is applicable
- strides is applicable
- Either the array itself is mutable or it has a mutable parent.
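The three requirements above can be sketched as a check written in plain Julia. This is not the package's actual implementation: the helper names `works`, `mutable_ancestor`, and `meets_requirements` are hypothetical, and the sketch assumes Julia 1.5 or later for `ismutable`.

```julia
# Hypothetical helpers for illustration only; these names are not part
# of NumPyArrays.jl or PyCall.jl.

# Return true if calling `f(A)` succeeds without throwing.
function works(f, A)
    try
        f(A)
        return true
    catch
        return false
    end
end

# Walk up the chain of `parent` calls and return the first mutable
# array found, or `nothing` if the chain ends without one.
function mutable_ancestor(A::AbstractArray)
    current = A
    while true
        ismutable(current) && return current
        p = parent(current)
        # `parent` returns its argument when there is no further parent.
        p === current && return nothing
        current = p
    end
end

# Combine the three requirements from the list above.
function meets_requirements(A::AbstractArray)
    return works(pointer, A) &&
           works(strides, A) &&
           mutable_ancestor(A) !== nothing
end

A = rand(UInt8, 4, 4)
sA = @view A[2:3, 2:3]
B = reshape(1:16, 4, 4)

meets_requirements(A)         # → true: a plain Array
meets_requirements(sA)        # → true: immutable view, mutable parent
meets_requirements(B)         # → false: the UnitRange parent is immutable
mutable_ancestor(sA) === A    # → true
```

The try/catch probe is deliberately conservative: it asks whether `pointer` and `strides` actually succeed on the given array rather than only whether a method exists.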
Currently, PyCall.jl only allows Base.StridedArray and several other array types such as LinearAlgebra.Adjoint to be converted directly into NumPy arrays without copying. This package loosens this to apply to AbstractArrays where strides is applicable.
More importantly, it also allows immutable arrays with a mutable parent to become NumPy arrays. The requirement for a mutable array or ancestor is due to PyCall's facility to manage garbage collection.
This approach originates from PyCall.jl PR #876.