Should `reshape` have an option for row-major order?

Currently reshape can only rearrange the raw linearly ordered data into Julia’s default column-major order (or colexicographical order, for higher-dimensional arrays). Would it be worth adding an optional keyword argument that reshapes the data in row major (or generally lexicographical) order? This would be convenient e.g. for interfacing with data generated by or for other programming languages that use the row-major convention (e.g. here). It’s fairly easy to reshape an array into an n \times m row-major matrix by reshaping it into an m \times n column-major matrix and then taking the transpose, but this gets cumbersome for higher-dimensional arrays.

(Note that I’m only suggesting changing the visible behavior of reshape, not the way that the resulting array is stored internally.)

permutedims?

1 Like

Okay, fair, I guess permutedims(reshape(A, p, o, n, m), [4, 3, 2, 1]) isn’t that much more cumbersome than reshape(A, m, n, o, p, ordering = rowmajor) would be. Maybe my suggestion isn’t necessary.

I think it is still a valid point. Although permutedims(reshape(A, p, o, n, m), [4, 3, 2, 1]) is nice, it is conceptually different from the order="row" version in that one still has to “think” for a second.

I would, besides order="row" and order="col", even suggest to follow numpy’s example and add something like order="C" and order="F" or order="Fortran" etc. for most common languages. I found those options really convenient.

4 Likes

The key about reshape is that it does not rearrange the data, it just presents it differently. It even shares structure:

julia> A = ones(4)
4-element Array{Float64,1}:
 1.0
 1.0
 1.0
 1.0

julia> B = reshape(A, 2, :)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0

julia> B[2,2] = 3.0
3.0

julia> A
4-element Array{Float64,1}:
 1.0
 1.0
 1.0
 3.0

What you are asking for is a useful operation (especially if interfacing with languages that use row-major), but it may be better to give it a different name.

4 Likes

I agree that “rearrange the data” was a poor choice of works in my OP. But couldn’t the current implementation of reshape and my proposed “row-major reshape” simply be two different ways to “present the data differently”? I don’t see why they are conceptually different enough to require two different names.

1 Like

As @Tamas_Papp showed, currently reshape always returns a view on the same data. The behavior you request (changing to row-major order) requires allocating a new copy. This is a significant change which deserves a separate function IMHO.

Not efficiently, no. In principle a “view” could be created for permuted dimensions, but it is vastly more complicated than a reshape.

You can’t change to row-major order just by changing the array strides? For example, if your raw data has six elements, then (schematically) setting rowstride = 1, columnstride = 3 gives a column-major 3x2 matrix, rowstride = 1, columnstride = 2 gives a column-major 2x3 matrix, rowstride = 2, columnstride = 1 gives a row-major 3x2 matrix, and rowstride = 1, columnstride = 3 gives a row-major 2x3 matrix.

reshape also preserves linear indexing, ie

reshape(A, ...)[i] == A[i]

for all i in 1:length(A).

Again, what you are asking for is a perfectly reasonable feature, but you should call it something else.

numpy:

This will be a new view object if possible; otherwise, it will be a copy. Note there is no guarantee of the memory layout (C- or Fortran- contiguous) of the returned array.

PermutedDimsArray does create a view; permutedims is the “eager” version which creates a copy.

@carstenbauer, Julia developers have tended to view “it might be A or it might be B depending on circumstance” as less than desirable; for example if you call those functions and then set elements in the resulting array, what will happen? Consequently among Julia devs there has been a drive to be 100% predictable, and it’s currently viewed as a bug when it’s not.

4 Likes

@tim.holy Yes, I know and couldn’t agree more. I like this viewpoint very much. I was more citing this for completeness. I looked it up because I was curious how numpy handles this issue.

1 Like

So, we could either do something similar to this

change_major_order(X::AbstractArray, sizes...=size(X)...) = permutedims(reshape(X, sizes...), length(sizes):-1:1)

which performs the multidimensional transpose (as proposed above), or go with functions like row_major(X, sizes...), col_major(X, sizes...) which always brings X into row/column-major order.

In any case, we should certainly mention in the doc that none of this changes the actual structure in memory.

I am not sure what you mean here. permutedims does rearrange the elements (of the copy). To see this, call vec on the result.

Sorry, what I meant was that the final Array is always row-major column-major order in the sense that Julia always uses row-major column-order internally. If I switch to column-major row-major order by using change_major_order the Array will look/be like expected, what has been a row before is now a column, but internally I still have to iterate over rows first columns to be fast.

julia> A = rand(2,2)
2×2 Array{Float64,2}:
 0.48992   0.135993
 0.856454  0.625905

julia> A_row_order = change_major_order(A)
2×2 Array{Float64,2}:
 0.48992   0.856454
 0.135993  0.625905

julia> A_row_order[1] = 123
123

julia> A
2×2 Array{Float64,2}:
 0.48992   0.135993
 0.856454  0.625905

It does produce a copy but A_col_order A_row_order is still stored in column-major order. I am still fast if I iterate over rows first columns.

Actually, it’s the opposite, Julia is always column-major (Fortran) order internally.

Argh, thanks, too much switching between orders. But my statement holds :slight_smile:
I update my post to not confuse anyone else.

@carstenbauer Sorry to nitpick, but for completeness you should edit post #16 to change both mentions of “iterate over rows” to “iterate over columns”. Also, note that your proposal in post #14 isn’t quite right; you need to reverse the desired sizes before the transpose so that they end up right after the transpose. So it would need to be something more like

change_major_order(X::AbstractArray, size...=size(X)...) = permutedims(reshape(X, reverse([size...])...), length(size):-1:1)

No need to be sorry, in fact, I should be sorry :slight_smile:
By “iterate over rows first” I was thinking of two for loops and the inner one should be the row index. But indeed, this is iterating columns. Changed it. Your second point is also right.

But, overlooking my incompetence, what about change_major_order(X) vs row_major(X) and col_major(X)?