Currently reshape can only rearrange the raw linearly ordered data into Julia’s default column-major order (or colexicographical order, for higher-dimensional arrays). Would it be worth adding an optional keyword argument that reshapes the data in row major (or generally lexicographical) order? This would be convenient e.g. for interfacing with data generated by or for other programming languages that use the row-major convention (e.g. here). It’s fairly easy to reshape an array into an n \times m row-major matrix by reshaping it into an m \times n column-major matrix and then taking the transpose, but this gets cumbersome for higher-dimensional arrays.
(Note that I’m only suggesting changing the visible behavior of reshape, not the way that the resulting array is stored internally.)
Okay, fair, I guess permutedims(reshape(A, p, o, n, m), [4, 3, 2, 1]) isn’t that much more cumbersome than reshape(A, m, n, o, p, ordering = rowmajor) would be. Maybe my suggestion isn’t necessary.
I think it is still a valid point. Although permutedims(reshape(A, p, o, n, m), [4, 3, 2, 1]) is nice, it is conceptually different from the order="row" version in that one still has to “think” for a second.
I would, besides order="row" and order="col", even suggest to follow numpy’s example and add something like order="C" and order="F" or order="Fortran" etc. for most common languages. I found those options really convenient.
What you are asking for is a useful operation (especially if interfacing with languages that use row-major), but it may be better to give it a different name.
I agree that “rearrange the data” was a poor choice of works in my OP. But couldn’t the current implementation of reshape and my proposed “row-major reshape” simply be two different ways to “present the data differently”? I don’t see why they are conceptually different enough to require two different names.
As @Tamas_Papp showed, currently reshape always returns a view on the same data. The behavior you request (changing to row-major order) requires allocating a new copy. This is a significant change which deserves a separate function IMHO.
You can’t change to row-major order just by changing the array strides? For example, if your raw data has six elements, then (schematically) setting rowstride = 1, columnstride = 3 gives a column-major 3x2 matrix, rowstride = 1, columnstride = 2 gives a column-major 2x3 matrix, rowstride = 2, columnstride = 1 gives a row-major 3x2 matrix, and rowstride = 1, columnstride = 3 gives a row-major 2x3 matrix.
This will be a new view object if possible; otherwise, it will be a copy. Note there is no guarantee of the memory layout (C- or Fortran- contiguous) of the returned array.
PermutedDimsArray does create a view; permutedims is the “eager” version which creates a copy.
@carstenbauer, Julia developers have tended to view “it might be A or it might be B depending on circumstance” as less than desirable; for example if you call those functions and then set elements in the resulting array, what will happen? Consequently among Julia devs there has been a drive to be 100% predictable, and it’s currently viewed as a bug when it’s not.
@tim.holy Yes, I know and couldn’t agree more. I like this viewpoint very much. I was more citing this for completeness. I looked it up because I was curious how numpy handles this issue.
which performs the multidimensional transpose (as proposed above), or go with functions like row_major(X, sizes...), col_major(X, sizes...) which always brings X into row/column-major order.
In any case, we should certainly mention in the doc that none of this changes the actual structure in memory.
Sorry, what I meant was that the final Array is always row-major column-major order in the sense that Julia always uses row-major column-order internally. If I switch to column-major row-major order by using change_major_order the Array will look/be like expected, what has been a row before is now a column, but internally I still have to iterate over rows first columns to be fast.
@carstenbauer Sorry to nitpick, but for completeness you should edit post #16 to change both mentions of “iterate over rows” to “iterate over columns”. Also, note that your proposal in post #14 isn’t quite right; you need to reverse the desired sizes before the transpose so that they end up right after the transpose. So it would need to be something more like
No need to be sorry, in fact, I should be sorry
By “iterate over rows first” I was thinking of two for loops and the inner one should be the row index. But indeed, this is iterating columns. Changed it. Your second point is also right.
But, overlooking my incompetence, what about change_major_order(X) vs row_major(X) and col_major(X)?