Hello,
I have a PyCall-wrapped Python module (musdb) that natively gives me a PyArray{Float64}, which can be fairly large.
At some stage I need an Array{Float32} of this, but conversion times vary a lot:
@elapsed convert(Array{Float32}, pa) # 15.8 s
@elapsed convert(Array{Float32}, convert(Array{Float64}, pa)) # 3.1 s
@elapsed convert(Array{Float32}, view(pa, :, :)) # 0.063 s
PyArray is a view of the underlying Python object, but apparently wrapping it in an explicit view() makes the conversion much faster.
—david
The PyArray type was implemented a fairly long time ago, before all of the IndexStyle stuff in Base; it could be that the convert routine is somehow using linear indexing with PyArray, which will be slow since it uses ind2sub, rather than the newer CartesianIndex loops that are used by SubArray?
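To make the linear-versus-Cartesian point concrete, here is a minimal sketch. Wrapped is a hypothetical stand-in type, not PyArray itself, and sum_linear and sum_eachindex are made-up helper names; the only assumption is that the wrapper defines (i, j) indexing and leaves IndexStyle at Base's default, which may or may not match what PyArray does.

```julia
# Hypothetical stand-in for a wrapper type such as PyArray: it only defines
# Cartesian (i, j) indexing and leaves IndexStyle at its default.
struct Wrapped{T} <: AbstractMatrix{T}
    data::Matrix{T}
end
Base.size(w::Wrapped) = size(w.data)
Base.getindex(w::Wrapped, i::Int, j::Int) = w.data[i, j]

w = Wrapped(rand(Float64, 300, 200))

# Base's default for a custom AbstractArray is IndexCartesian().
@show IndexStyle(typeof(w))

# A linear index has to be translated into (i, j) on every access;
# CartesianIndices performs the conversion that ind2sub used to do.
k = 12345
ci = CartesianIndices(w)[k]
@assert w[k] == w[ci]

# Loop over linear indices: pays for the index conversion on each element.
function sum_linear(a)
    s = zero(eltype(a))
    for k in 1:length(a)
        s += a[k]
    end
    return s
end

# Loop over eachindex: Base picks the index type that suits the array,
# here CartesianIndex values, so no per-element conversion is needed.
function sum_eachindex(a)
    s = zero(eltype(a))
    for I in eachindex(a)
        s += a[I]
    end
    return s
end

@assert sum_linear(w) ≈ sum_eachindex(w)
```

Timing the two loops with @elapsed on the real PyArray would show whether the index conversion is actually what dominates.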
It would be interesting to drill down (with @which or @edit) to find what methods are being called by the convert routine, and what additional IndexStyle (or whatever) methods could be defined for PyArray to make it switch over to the faster path apparently used by SubArray. A PR would be welcome.
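As a rough sketch of that drill-down: the snippet uses a plain SubArray as a stand-in so it runs without PyCall; with the module loaded you would substitute pa for a. Which methods get reported depends on the Julia version, so no output is shown, and the assumption that convert ends up in the constructor and then copyto! should be checked at each step rather than taken as given.

```julia
using InteractiveUtils  # provides @which and @edit outside the REPL

# Stand-in argument; replace `a` with the PyArray `pa` when PyCall is loaded.
a = view(rand(Float64, 4, 3), :, :)

# Which convert method is selected for this argument type?
m = @which convert(Array{Float32}, a)
println(m)

# Assumption: convert forwards to the Array constructor, which in turn
# ends in a generic copy loop. Check each step the same way.
println(@which Array{Float32}(a))
println(@which copyto!(Array{Float32}(undef, size(a)), a))

# The indexing style that the generic copy loops dispatch on.
println(IndexStyle(typeof(a)))
```

Running the same four queries on pa and on view(pa, :, :) and comparing the reported methods should show where the two paths diverge; @edit in place of @which opens the source of the selected method.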
OK, thanks. I might look into this, but it would require quite a bit of study on my side.