Zero-overhead numpy array with juliacall

I want to expose a Julia package to Python users via juliacall. The function in the Julia code returns a 2D array. How can this be converted to a 2D numpy array with minimal overhead, e.g. without the cost of converting between column-major and row-major storage?

You can’t turn a row major into a column major array without cost if you want it to have the same shape. If you’re ok with flipping the dimensions then (i, j, k) in one storage order is the same as (k, j, i) in the other.

2 Likes

For 2D arrays specifically, the way to “zero-cost convert” between column- and row-major storage is to apply a lazy transpose the array (i.e., transpose(matrix)). For any number of dimensions, you can use a PermutedDimsArray wrapper and reverse the index order.

Although not all functions will work gracefully with this. Some might make a copy of the transpose before doing any work. And it may be that this simply doesn’t end up working nicely enough for you. It might be that copying the data is faster anyway, due to algorithmic inefficiencies or missing methods with the non-expected ordering.

EDIT: as pointed out, the challenge with your described workflow is that you’d need to return a transposed result from the Julia code so that it would be row-major such that it would be compatible with NumPy. Getting the result as a transpose is probably not easy to do. Consider also that a transposed-copy (or in-place transpose) is not an expensive operation, especially next to anything complicated that a package is probably doing. It may be inconsequential to simply do the copy.

It might be that copying the data is faster anyway

Yes, anything that is written for good cache locality will probably be quite a bit slower if you run the algorithm intended for column-major storage on a lazily transposed row-major array. That’s what I wanted to imply by “without cost”, which should not just include the cost of copying.

numpy.asarray(your_matrix) is a zero-copy wrapper around the original matrix data (assuming it’s a data type supported by Numpy).

It has the same majorness as the original matrix. If you really want to switch majorness, you can take a lazy transpose such as with numpy.asarray(your_matrix).T.

2 Likes

Numpy natively supports column-major arrays (they call it “Fortran order”), though it’s not the default.

1 Like

I’ll also add that Julia arrays in Python support the buffer protocol and the numpy array interface, so many functions accepting numpy arrays will also work with Julia arrays directly without needing to be wrapped as an actual Numpy array.

Thanks, it seems to work:

# Python REPL session
>>> from juliacall import Main as jl
>>> import numpy as np
>>> jl_mat = jl.seval("[1 2; 3 4]")
>>> jl_mat
Julia:
2Ă—2 Matrix{Int64}:
 1  2
 3  4
>>> py_mat = np.asarray(jl_mat)
>>> py_mat
array([[1, 2],
       [3, 4]])
>>> np.isfortran(py_mat)
True

The last line confirms that the array uses Fortran-style (i.e. column-major) memory layout.

1 Like