Structured array in PyCall

I need to interface with a Python API that returns a NumPy structured array. Currently PyCall just returns it as PyObject array(..., dtype=...). Is there a reasonable way to convert it to some Julia-native format, such as an array of named tuples?

Simple example:

using PyCall
py"""
import numpy as np
"""
py"np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)], dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])"
# gives PyObject array([('Rex', 9, 81.), ('Fido', 3, 27.)], dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

I don’t think PyCall directly supports it. But you can manually load the data as an array-of-(named)tuples.

julia> py"""
       import numpy
       a = numpy.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)], dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
       """

julia> p = py"a.ctypes.data"
93965367751616

julia> a = unsafe_wrap(Array, Ptr{NamedTuple{(:name, :age, :weight), Tuple{NTuple{10, UInt32}, Int32, Float32}}}(p), 2)
2-element Array{NamedTuple{(:name, :age, :weight),Tuple{NTuple{10,UInt32},Int32,Float32}},1}:
 (name = (0x00000052, 0x00000065, 0x00000078, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000), age = 9, weight = 81.0)
 (name = (0x00000046, 0x00000069, 0x00000064, 0x0000006f, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000), age = 3, weight = 27.0)

julia> a[1]
(name = (0x00000052, 0x00000065, 0x00000078, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000), age = 9, weight = 81.0f0)

julia> a[1].age
9

julia> a[2].weight
27.0f0

julia> s = transcode(String, collect(a[1].name))
"Rex\0\0\0\0\0\0\0"

julia> s[1:first(findfirst("\0", s))-1]  # maybe there is a better way?
"Rex"

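On the "better way" question: one option is to strip the NUL padding with rstrip instead of searching for the first "\0". Below is a minimal sketch in plain Julia (no PyCall needed). The helper name decode_ufield is made up for illustration and is not part of PyCall or Base. It assumes the field holds UTF-32 code units padded with trailing NULs, as in the 'U10' field above.

```julia
# Hypothetical helper (not from PyCall or Base): decode a fixed-width NumPy 'U'
# field, held as UTF-32 code units, into a String without its NUL padding.
function decode_ufield(units::NTuple{N,UInt32}) where {N}
    s = transcode(String, collect(units))  # UTF-32 code units -> UTF-8 String
    return String(rstrip(s, '\0'))         # drop the trailing NUL padding
end

# Same code units as a[1].name in the session above.
name = (0x00000052, 0x00000065, 0x00000078, 0x00000000, 0x00000000,
        0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000)

decode_ufield(name)  # "Rex"
```

Unlike the findfirst version, this also handles a name that fills all 10 characters: with no "\0" in the string, findfirst returns nothing and the slicing expression throws.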
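If you go this way, read the unsafe_wrap documentation and understand the caveats. In particular, the wrapped array shares memory with the NumPy array rather than copying it, so the Python object must stay alive for as long as you use it (or call copy on the result), and the Julia element type has to match the byte layout of the dtype exactly.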
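To make the layout caveat concrete, here is a rough sketch of a sanity check you could run before wrapping. It is plain Julia. The NumPy offsets and item size are not queried from Python but worked out by hand, assuming the default packed dtype (align=False), so treat those numbers as assumptions.

```julia
# Record type used with unsafe_wrap in the session above.
T = NamedTuple{(:name, :age, :weight), Tuple{NTuple{10, UInt32}, Int32, Float32}}

# Assumed layout of the packed NumPy dtype, computed by hand:
# 'U10' takes 10 * 4 = 40 bytes, 'i4' and 'f4' take 4 bytes each.
numpy_offsets  = [0, 40, 44]
numpy_itemsize = 48

# Byte offset of each field as Julia lays the type out in memory.
julia_offsets = [Int(fieldoffset(T, i)) for i in 1:fieldcount(T)]

# All three must hold, otherwise the wrapped records will not line up.
@assert isbitstype(T)
@assert julia_offsets == numpy_offsets
@assert sizeof(T) == numpy_itemsize

# Counterexample: a packed dtype [('flag', 'u1'), ('x', 'f8')] uses 9 bytes per
# record, but Julia pads the matching tuple so the Float64 stays aligned.
U = NamedTuple{(:flag, :x), Tuple{UInt8, Float64}}
sizeof(U)  # 16 on a typical 64-bit build, so this dtype cannot be wrapped as-is
```

In a real session you would probably compare against the dtype's itemsize and field offsets as reported by NumPy instead of hard-coding them.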
