Why is `transpose` recursive?

The deprecation of transpose(::Any) and its impact on the ability to use transpose for generic matrices has confused people here at Invenia. From a usage standpoint, we understand that permutedims is the way forward. However, I’ve been trying to discover why transpose is called recursively on a matrix’s elements by looking through the relevant GitHub issues and have come up empty. In our elementary understanding of linear algebra we can’t find a reason why this should be the case.

Can someone please explain why it is more correct to have transpose be recursive?

Thanks,
Eric

6 Likes

Recursive transpose was originally implemented for (and makes sense for, IMO) block matrices: https://github.com/JuliaLang/julia/pull/7244.

My understanding is that since very early on, this has been defined to allow for a very simple definition of matrices in block form.

E.g. it is relatively common in linear algebra to express given matrices A, B, C, D of compatible sizes,

M = [A B;
     C D]

Now we can define things like M' * M and get the expected results, where

M' == [A' C';
       B' D']

On the other hand, I’m not sure this functionality has been fully followed through on in Base. For example, an extremely obvious use case is block-diagonal matrices using Diagonal, but transpose(::Diagonal) is not recursive!

I was actually hoping we could clear this up sometime in the next release cycle - do we want to more fully support block-diagonal matrices using nested arrays? Or is it, upon hindsight, better to rely on specific wrapper types (might be BlockArray, BlockMatrix, BlockVector) which present a “flattened” view of the matrix for which recursion is implied (and remove recursion from the standard definition)?

3 Likes

(I have previously wondered if we should go for the flattened view like

struct BlockArray{T, N, A <: AbstractArray{<:AbstractArray{T, N}, N}}
    blocks::A
end

where getindex(blockarray::BlockArray, i...) returns an element of the flattened array, and you can manipulate the blocks directly via blockarray.blocks. In this case it is clear that transpose is recursive on blockarray.blocks.)

2 Likes

The question has already been answered: this is the correct behavior for block matrices. The price is a separation between transpose and permutedims where the former is a linear algebra operation whereas the latter is a general array operation. Personally, I also think it is unfortunate to use ' for non-linear algebra operations like e.g. string arrays.

1 Like

Is this price worth paying? How often does one do represent block matrices with matrices of matrices? And wouldn’t a custom type as Andy’s proposed be better? It might be better to have ' and .' not be recursive and be able to use them on non-numerical arrays.

Moreover, it’s unclear to me that recursive transposition is correct under other arguably more natural interpretations of a matrix of matrices. For example, what if I want to do linear algebra in the ring of 2×2 real matrices? A linear transform in this space is naturally a matrix of matrices and recursive transposition is the incorrect behavior in this case as far as I can tell. This strikes me as a more natural meaning for a matrix of matrices than interpreting it as a block representation of a real-valued matrix.

If one wants to use a matrix of matrices block representation of a single matrix, then a new matrix type would seem to be in order – a type for which transposition could be defined to be recursive, without burdening the rest of the language.

14 Likes

I agree that this sounds like stuff specifically for block matrices, so it sounds like it should be dispatches for a block matrix type.

1 Like

Are strings not valid algebraic objects? If you define a ring on regular patterns where * is concatenation – as it already is in Julia for strings, but extended to patterns – and define + as alternation of patterns – i.e. “a” + “b” => a|b in regular expression terms – then strings make perfectly legitimate linear algebraic entities, albeit not numerical ones. In that case, transposing an array of them should do precisely what it used to do before ' was disallowed for non-numeric arrays.

1 Like

Honestly, I think this is a strawman. Right now, we don’t define operations on strings such that you can do any useful linear algebra on them. But more importantly, the use cases for transposing string arrays seen so far have all been array operations where it was considered useful to have a 1xn string array instead of a column. E.g. for organizing string output. For those use cases, I don’t think the mathy transpose is to be preferred.

If we really want to support more math operations for strings then we should just define transpose(x::String) = x since it would be the correct definition. I’m not convinced that people really wants to do linear algebra with our strings.

Is this price worth paying?

That was our conclusion. It is not that this wasn’t known at the time we made the decision.

Moreover, it’s unclear to me that recursive transposition is correct under other arguably more natural interpretations of a matrix of matrices. For example, what if I want to do linear algebra in the ring of 2×2 real matrices? A linear transform in this space is naturally a matrix of matrices and recursive transposition is the incorrect behavior in this case as far as I can tell.

If this is really the case then, then it is worth reconsidering. The current behavior was considered “the right one” from a math point of view. Just woke up, so I’ll have to think a little longer before I can comment on your specific example.

I like recursive transpose as i even think of transpose of a complex or quaternion scalar as being
conjugate. I’d like to be able to transpose arrays of strings.

Just to note a simple and prevalent case this comes up in. For some plotting libraries, you need to give the legend as a row vector of strings. Each column matches the legend to the line (when plotting a matrix, each column is a line). If you hardcode ["leg1" "leg2" "leg3"], you’re fine. But, if you have a vector of strings, strvec' will error. The workaround is reshape(strvec,1,length(strvec)), but it’s odd that this is so complicated.

7 Likes

I like recursive transpose as i even think of transpose of a complex or quaternion scalar as being conjugate. I’d like to be able to transpose arrays of strings.

Conjugation is an entirely different story – ctranspose of a matrix should definitely call conjugate on its elements (it would be incorrect otherwise). It would be helpful if you could outline how quaternions relate to this in more detail. I’m still convinced that this behavior was decided on based entirely on an uncommon representation of matrices that would be better served by a custom type.

2 Likes

I don’t think conjugation is as entirely different as it would appear.
Exmaple if you use 2x2 real matrices for complex numbers, then
transpose and conjugation are isomorphic.

I would use ctranspose for all the way down and transpose for not all the way down.
perhaps different names are better

As a spitball: ‘recursive’ is broad, and ‘transpose’ is just one application. Since they’re orthogonal, would it be a useful api to explosive the following to the user?

transpose(A) # not recursive
recursive(transpose)(A) # recursive!
recursive(f)(A) # for any matrix function f

The function recursive takes a function f and returns recursive(f), which is a function that does a super-simple check on the elements of A – if they are arrays or tuples or whatever, apply recursive(f) again, and if they are scalars, leave them be, and finally apply f on the result.

That’s the rub, and it is precisely why transpose is now annoying to use for non-numeric types. What makes a scalar? Numbers clearly are, and arrays clearly are not, but what about other things? Make (c)transpose less error-prone · Issue #13171 · JuliaLang/julia · GitHub

+1 for making ' non-recursive. The use case of using it as an alias for permutedims vs matrix-of-matrices block representations in user code must be at least 100:1.

7 Likes

' is probably the oddest syntax in the language and is allowed syntax because of the common use of this notation in linear algebra (and Matlab). Making it do the wrong thing for a linear algebra operation for enabling a short alias for reshaping non-numerical arrays seems, as convenient as it might be, to me to be the wrong solution.

I also don’t think that ' is particularly easy to read in a non-linear algebra context. In contrast, the meaning of reshape is quite easy to understand but slightly longer to type.

I also don’t think that ’ is particularly easy to read in a non-linear algebra context. In contrast, the meaning of reshape is quite easy to understand but slightly longer to type.

I do think this is a very important point. The amount of MATLAB pain (bugs and incomprehensible code) I’ve seen over the years from this… yikes. I always recommend people use reshape and permutedims unless it is linear algebra. Apart from the fact that it’s small enough to overlook, I think the root issue is that ' is a toggle… some one-way convenience functions like vec (colvec?), rowvec, colmatrix and rowmatrix to cast compatible things to the correct shape could be both more readable and shorter than, e.g., reshape(vec, (1, length(vec)).

Nonetheless, I’d point out that making transpose non-recursive means you can opt-in to recursion using BlockedArray (or whatever). At the moment, there is no mechanism to opt-out of recursion.

2 Likes

Another advantage of removing recursion here is that it’s one way to get rid of reliance on inference for the element type of RowVector (currently RowVector{T, V <: AbstractVector} gets T from Core.Inference.return_type(transpose, Tuple{eltype(V)}). And the same will go for a future TransposedMatrix. Removing reliance of inference behavior from run-time behavior seemed to be a priority of late.

(Note: this is probably not the only way to remove inference here, e.g. letting the user set T and making an assertion within getindex could be another option, but there still needs to be a “default” T and the only semi-workable option is T = eltype(V) which would fail for recursive block vectors/matrices anyway!).