I’m in need of representing a row vector where the entries themselves are row vectors, e.g.
a, b = RowVector([1,2]),RowVector([3,4])
r = RowVector([a',b'])
(In practice, my a and b are special types that represent functionals, i.e., infinite-dimensional row vectors.)
The need to transpose a and b seems superfluous, and the name RowVector is confusing.
My proposal is to rename RowVector to TransposeVector and add a new implementation of RowVector that doesn’t transpose the entries, so that the above code could be written in two ways:
a, b = RowVector([1,2]),RowVector([3,4])
r = TransposeVector([a',b'])
r = RowVector([a,b])
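For illustration, the proposed non-recursive variant could be prototyped as a small wrapper type along these lines (a sketch with made-up names, not the actual Base implementation):

```julia
# Hypothetical sketch: a minimal non-recursive row-vector wrapper.
# Unlike Base's RowVector, it stores and returns entries as-is,
# with no recursive transpose applied.
struct PlainRowVector{T,V<:AbstractVector{T}} <: AbstractMatrix{T}
    vec::V
end

Base.size(r::PlainRowVector) = (1, length(r.vec))
Base.getindex(r::PlainRowVector, i::Int, j::Int) = r.vec[j]  # entry untouched

a = [1, 2]
b = [3, 4]
r = PlainRowVector([a, b])   # no need to write [a', b']
```

The key point is that indexing returns the stored entry itself, so the entries never need to be transposed on construction.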
I’m happy to make a pull request (I’ve added ApproxFun.RowVector to do what I want) but wanted to check first whether there is an argument against this change.
There have been discussions about changing this for v0.7/1.0 - particularly that transpose becomes non-recursive. OTOH, the last I heard, ctranspose may remain recursive.
I think it is hard to defend the complexity of having two implementations with slightly different behavior. It would be confusing for most users and demanding to maintain, so I think we should figure out whether the current version needs to be adjusted a bit or whether the change is not worth it.
Hence, first of all: you think the current version is confusing and superfluous, but not wrong, and it doesn't make it impossible for you to do what you want. Is that right?
I guess an alternative model could be that the behavior of (c)transpose remains recursive but RowVector isn’t recursive, i.e. something like
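A minimal sketch of what that model might look like (hypothetical helper names, not from any actual proposal):

```julia
# Sketch: ctranspose stays recursive (it transforms the entries),
# while the row-vector construction itself only changes the shape.
rowvec(v::AbstractVector) = permutedims(v)   # shape change only, entries untouched

ctranspose_entry(x::Number) = conj(x)
ctranspose_entry(A::AbstractMatrix) = A'     # recurse into matrix entries

recursive_ctranspose(v::AbstractVector) = permutedims(ctranspose_entry.(v))
```

Here `rowvec` plays the role of the non-recursive `RowVector`, and `recursive_ctranspose` is what `'` would keep doing.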
There is a tension here between two uses of transposition. One is simply a data rearrangement, divorced from linear algebra, i.e. just a special case of permutedims. The other is to construct an adjoint or dual vector. The latter meaning needs to be recursive to work properly.
For example, if you have a vector of functions (a Vector{Fun} in ApproxFun) the dual should be a “row vector” of linear functionals.
One possible compromise is the suggestion that transpose (x.') is mainly used for data rearrangement, and ctranspose (x') is mainly for the algebraic meaning (since this is the reason for the complex conjugation), so we could make the former non-recursive and keep the latter recursive.
I agree that the situation is annoying. But .' and ' already have subtly different meanings (the latter is conjugating for complex arrays), so maybe further distinguishing them is a good thing?
(Note that, in order to implement a recursive ctranspose and a non-recursive transpose, we will need a non-recursive type of RowVector anyway.)
I liked the proposal of renaming ctranspose as adjoint. Is that going ahead?
At the moment ApproxFun abuses notation and overrides ctranspose to mean differentiation, so your example won't work as expected. I'd propose adding an extra layer of indirection and having x' call tick(x), with the default implementation tick(x) = ctranspose(x).
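A sketch of that indirection (using adjoint, ctranspose's later name, as the fallback; the Fun stand-in and the placeholder return value are purely illustrative):

```julia
# Hypothetical layer of indirection: x' would lower to tick(x).
tick(x) = adjoint(x)                 # default: ordinary conjugate transpose

struct Fun end                       # stand-in for ApproxFun's Fun type
derivative(f::Fun) = :differentiated # placeholder for actual differentiation
tick(f::Fun) = derivative(f)         # ApproxFun overrides ' via tick, not ctranspose
```

With this, a package can repurpose the postfix quote for its own types without touching ctranspose itself.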
One of the key goals here is to make x' cheaper and allocation-free.
Does Base have some sort of lazily-mapped array? This could replace ConjArray and help here with the recursive adjoint, for instance. It’s the cleanest solution I’ve thought of, anyway.
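For instance, a lazily-mapped vector in the spirit of MappedArrays.jl could look like this (a hypothetical sketch; mapping conj lazily would recover ConjArray's behavior):

```julia
# Minimal lazily-mapped vector: the function is applied on access,
# so no copy of the data is ever made.
struct MappedVector{T,F,V<:AbstractVector} <: AbstractVector{T}
    f::F
    data::V
end
MappedVector(f, v::AbstractVector) =
    MappedVector{typeof(f(first(v))),typeof(f),typeof(v)}(f, v)

Base.size(m::MappedVector) = size(m.data)
Base.getindex(m::MappedVector, i::Int) = m.f(m.data[i])  # lazy application

conjvec(v) = MappedVector(conj, v)   # a ConjArray replacement
c = conjvec([1 + 2im, 3 - 1im])
```

Mapping a recursive adjoint instead of conj would give the lazy recursive behavior discussed above.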
Sure, but we have a dot function that indicates the default inner product. As for adjoint vs dual, I agree it’s an ambiguity in standard notations that use the same symbol for both. For linear operators it is adjoint, and for vectors (in an inner product space) it is the dual. (Since scalar functions are not usually treated as linear operators, the notation should arguably refer to the dual in that case.)
OK, I’d propose the following for better consistency:
dual(a::Vector) = ConjRowVector(a)
adjoint(a::Vector) = error()
dual(A::Matrix) = error() # or a special data type using the Frobenius inner product to define the dual
adjoint(A::Matrix) = ConjTransposeMatrix(A)
'(a::Vector) = dual(a)
'(A::Matrix) = adjoint(A)
That way, in code, if you mean dual you use dual and if you mean adjoint you use adjoint, and you use ' to mean either when you know it's a matrix or a vector.
In ApproxFun then I would have (here I assume continuous scalar-valued, but in practice it would be generalized to Array-valued)
dual(a::Fun) = DefiniteIntegral()[weight(space(a))*adjoint(a)]
adjoint(a::Fun) = conj(a) # identifies a with the multiplication operator
And I can carry on using ' to mean derivative while playing well with Base.
Using a postfix single quote mark to denote matrix conjugate transpose was one of the design decisions that sort of made sense when Matlab was invented, but doesn’t anymore (in my opinion). I guess they did it because the single quote is the closest thing to the dagger symbol that was available in ASCII. The fact that it also resembles the prime symbol that is used for derivatives didn’t matter at the time, because Matlab didn’t deal with derivatives.
There are several reasons why Julia shouldn’t copy the Matlab convention.
Julia is about much more than matrix algebra. The f' notation for derivatives is as valid as the matrix transpose (if not more so).
Julia is not limited to ASCII. Lots of functions and operators already have Unicode aliases.
Julia’s design philosophy seems to be all about avoiding special cases. Why make a special case of ' and .' as the only postfix operators?
The .' notation from Matlab clashes with Julia’s existing uses of dots. x.' could reasonably map to either getfield(x, :') or broadcast(', x) (if ' were allowed as a symbol).
I think Julia should either get rid of postfix operators altogether. Transpose could then be written using, e.g., the exponentiation operator: something like x^ᵀ would be easy to implement by making ᵀ a constant.
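The x^ᵀ idea works today with no parser changes, since ᵀ is already a valid identifier; a minimal sketch (the tag type name is made up):

```julia
# Make ᵀ a singleton constant and overload ^ for it.
struct TransposeTag end
const ᵀ = TransposeTag()
Base.:^(x::AbstractMatrix, ::TransposeTag) = transpose(x)

A = [1 2; 3 4]
A^ᵀ   # the usual transpose, spelled with the superscript T
```

Because TransposeTag is our own type, overloading Base.:^ for it is not type piracy.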
Or Julia could go big and make a whole range of Unicode characters into user-definable postfix operators. This would break some existing code, as people like to use the Unicode superscript characters in variable names, but it would allow for some cool notation like x² for x^2 and x° for deg2rad(x).
In either case, there would be enough possibilities available that each concept can get a unique notation.
When I think about it, the parser should not have to decide which Unicode characters are prefix or postfix operators. It is sufficient that it break up expressions like xᵀ into separate but adjacent symbols instead of combining them into one.
Then introduce a new function adjacent, so that xᵀ maps to adjacent(x,ᵀ). The default method should be adjacent(x,y) = x*y, so that 2π works as before. Let adjacent(x, f::Function) = f(x) so that xᵀ becomes ᵀ(x) and define that as transpose(x).
The adjacent “operator” would have higher precedence than *, so that A*Bᵀ becomes A*(Bᵀ) rather than (A*B)ᵀ.
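The semantics of that lowering can be sketched with ordinary function calls (the parser change itself is hypothetical, so here adjacent is invoked by hand; all names are illustrative):

```julia
# What the parser would emit for juxtaposed symbols:
adjacent(x, y) = x * y            # default: juxtaposition means multiplication
adjacent(x, f::Function) = f(x)   # a function on the right gets applied instead

ᵀ(x) = transpose(x)               # so "xᵀ" would lower to adjacent(x, ᵀ) = ᵀ(x)

adjacent(2, π)                    # behaves like today's 2π
adjacent([1 2; 3 4], ᵀ)           # behaves like the proposed xᵀ
```

The two methods capture both existing implicit multiplication and the new postfix-function behavior in one mechanism.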