What's the meaning of the array syntax , ;

Hello,

I have been playing with Julia for a few weeks. (I have used MATLAB, but have more of a Python background.) I find the following almost impossible to remember when typing arrays manually:

  • spaces yield horizontal arrays
  • commas yield vertical arrays
  • semicolons yield vertical arrays
  • semicolons and spaces yield 2x2 arrays
  • commas and spaces yield errors
  • commas with subarrays yield arrays-of-arrays
  • semicolons with subarrays yield flat arrays
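
For concreteness, here is a minimal sketch of the seven cases, assuming a recent Julia 1.x session. The names a through g are only labels for this example, and the failing case is parsed from a string so that the rest of the snippet still loads.

```julia
a = [1 2 3]            # spaces: 1×3 matrix (horizontal)
b = [1, 2, 3]          # commas: 3-element vector
c = [1; 2; 3]          # semicolons: also a 3-element vector
d = [1 2; 3 4]         # semicolons and spaces: 2×2 matrix
e = [[1, 2], [3, 4]]   # commas with subarrays: vector of vectors
f = [[1, 2]; [3, 4]]   # semicolons with subarrays: flat 4-element vector

# Commas mixed with spaces do not parse. With raise = false the parser
# returns an error expression instead of throwing.
g = Meta.parse("[1 2, 3 4]"; raise = false)

println(size(a), " ", size(b), " ", size(c), " ", size(d))
println(typeof(e), " ", typeof(f))
println(g.head)
```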

cf. https://github.com/JuliaLang/julia/blob/master/doc/src/manual/arrays.md

It’s hard to remember because I can’t see a philosophy. Commas and semicolons seem to just behave differently in ways that are not predictable (from remembering just 6 out of the above 7 facts, I can never deduce the 7th fact). Maybe semicolons are distinguished from commas by concatenating “more aggressively”, but if this is the only difference, why can’t matrices be declared as [1 2, 3 4]?

It creates confusion for me later in the documentation, because for later statements (e.g. “An array with a specific element type can be constructed using the syntax T[A, B, C, ...].”) I am not sure if it is supposed to be general, or refers to a particular property of , versus ;. (Of course, in this case it doesn’t, but as a beginner I have to test each case to check.)

What’s the logic distinguishing , from ; for array creation?

Think of commas as syntax for enumeration, and spaces and semicolons as concatenation operators.

julia> z = zeros(2,2)
2×2 Array{Float64,2}:
 0.0  0.0
 0.0  0.0

julia> o = ones(2,2)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0

julia> [z o; o z]
4×4 Array{Float64,2}:
 0.0  0.0  1.0  1.0
 0.0  0.0  1.0  1.0
 1.0  1.0  0.0  0.0
 1.0  1.0  0.0  0.0

That describes some of the observed behaviour, thank you. But I am still wondering two things:

Firstly, why isn’t [z o, o z] given any meaning in your example? For example, it could give a 2x2 array of 2x2 arrays.

Secondly, what completes this table? We have comma, semicolon, and space, but no ‘horizontal enumerator’.

     | enum. | conc. |
----------------------
vert |   ,   |   ;   |
horz |  ???  |  [_]  |

Possibly because enumeration is directionless: it gives flat structures, isomorphic to vectors. , is not a vertical separator, just a separator. Convention considers vectors “vertical”, but they are just vectors, directionless per se. In light of this, I don’t think your table is the right mental model.

[z o, o z] could be given meaning, but space/; is special syntax for a matrix expression and you can’t mix it with anything else (which IMO is a solid design choice). Try [[z o], [o z]].
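
A small sketch of what that suggestion produces, assuming Julia 1.x and the same z and o as in the earlier example. The reshape line is an extra assumption on my part: it is one possible way to get a 2x2 array whose elements are 2x2 arrays, not something the syntax provides directly.

```julia
z = zeros(2, 2)
o = ones(2, 2)

# Each inner bracket is a horizontal concatenation (2×4). The outer commas
# then enumerate the two results without concatenating them.
nested = [[z o], [o z]]

# One option for a 2×2 array of 2×2 blocks: enumerate the four blocks with
# commas, then reshape (blocks are laid out in column-major order).
blocks = reshape([z, o, o, z], 2, 2)

println(typeof(nested), " with elements of size ", size(nested[1]))
println(typeof(blocks), " of size ", size(blocks))
```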

That’s an interesting perspective and does help fill in the gap for me. But it makes me think that, if comma gives a vector and space gives a 1xn Array{T,2}, then ; would produce an nx1 Array{T,2}, so I still cannot completely predict the rules.

Regardless, it might be nice to spell out the meaning behind this design choice in the documentation more explicitly, or at least to clarify that , and ; are interchangeable for vectors but only ; applies to matrices.

And I still feel that [z o, o z] could be a 2x2 array of 2x2 arrays. It has a certain logic, I see no drawback, and [[z o], [o z]] is very different in input and in output.

No, ; and , are not interchangeable for vectors:

julia> [[2];[3]]
2-element Array{Int64,1}:
 2
 3

julia> [[2],[3]]
2-element Array{Array{Int64,1},1}:
 [2]
 [3]

i.e. ; concatenates.

The interface is still in flux. See in particular

and for some history

and related

If you think you can contribute to better design, you should probably comment on Github.

It looks like for many people, the preferred solution is to disentangle array construction and concatenation ops. I agree with that. That idea has not gained traction in some years, although it should happen ASAP before 0.6/1.0 if it will ever happen …

A smaller improvement to the current situation along the above lines could be to make ; return an nx1 Array. That would make more sense to me, but it then seems to suggest changing vcat to return an nx1 Array. Thoughts?

This has changed significantly in 0.4 and 0.5 (and possibly already in 0.3, I forget). That’s as fast as it could be changed without silently breaking people’s code. Changing syntax in a system that’s actively used by a lot of people is slow, delicate work. Using [ ] with , separators now only ever means array construction, whereas ; and space as separators only mean vertical and horizontal concatenation, respectively (which is why you can’t mix , with ; or space). So this is now consistent once you understand what they mean, and it lets you do both kinds of operations.
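
A minimal sketch of that split, assuming Julia 1.x. Each bracket form is shown next to the function call it corresponds to; the variable names are arbitrary.

```julia
a = [1, 2]
b = [3, 4]

v = [a, b]       # construction: a 2-element vector whose elements are a and b
s = [a; b]       # vertical concatenation, same as vcat(a, b)
h = [a b]        # horizontal concatenation, same as hcat(a, b)
m = [a b; b a]   # both at once, same as hvcat((2, 2), a, b, b, a)

println(s == vcat(a, b))
println(h == hcat(a, b))
println(m == hvcat((2, 2), a, b, b, a))
```

The first argument of hvcat gives the number of blocks in each row, so (2, 2) means two rows of two blocks each.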

I used to feel that we should get rid of space-sensitive syntax everywhere, but that was partly because there were occasional little bugs with it. It’s been a long time since there have been any such bugs and people seem to have gotten used to it – especially since the rules are now consistent everywhere there’s a space-sensitive context.

Are you suggesting that we stop using [ ] for array concatenation entirely? Or are there some corner cases of [ ] syntax that you’ve found annoying? Specifically, do you mean that you think [1;2;3] should produce a 3x1 matrix rather than a 3-element vector? I suppose that there’s a certain consistency to having all concatenation operations produce matrices, but that would leave us without the ability to write [v;w] to concatenate two vectors and get another vector, so there’s a tradeoff. If you’ve got a good argument for why it’s a better choice, I’d love to hear it.
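
To make the tradeoff concrete, here is a short sketch assuming Julia 1.x: [v; w] stays one-dimensional, and an n×1 matrix can still be requested explicitly, for example with reshape.

```julia
v = [1, 2]
w = [3, 4, 5]

vw = [v; w]                # 5-element vector, not a 5×1 matrix
col = reshape(vw, :, 1)    # explicit 5×1 matrix over the same data

println(size(vw))
println(size(col))
```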

What about embracing space-sensitive syntax fully?

Personally, what I find confusing is trying to remember that ; means vertical concatenation. One source of confusion is that [1,2,3] and [1;2;3] generate the same result, but for different reasons. Luckily, a newline works as a stand-in for ; and does pretty much exactly what you’d expect:

julia> a = [1 2]
1×2 Array{Int64,2}:
 1  2

julia> b = [3 4]
1×2 Array{Int64,2}:
 3  4

julia> [a b
        b a]
2×4 Array{Int64,2}:
 1  2  3  4
 3  4  1  2

In fact, one can avoid the whole confusion around , and ; entirely and just use whitespace instead (and if you are worried about taking up too many lines of code but still want to avoid the confusion, there’s always hcat and vcat).
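
A sketch of both options side by side, assuming Julia 1.x and the same a and b as above; m1 and m2 are just names for this example.

```julia
a = [1 2]
b = [3 4]

# Newline between rows instead of a semicolon.
m1 = [a b
      b a]

# The same matrix written with explicit function calls on one line.
m2 = vcat(hcat(a, b), hcat(b, a))

println(m1 == m2)
```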

No, I don’t want to suggest any really big changes, not now.

It would make more sense to me if ; produced an nx1 matrix instead of a vector. So the simple rule becomes: commas do vectors, semicolons and spaces concatenate and produce matrices.

That would entail the following changes.

  • vcat produces nx1 matrices.
  • hcat produces 1xn matrices
  • hvcat produces mxn matrices
  • a new function cat takes vectors and returns a vector

The idea is that hcat and vcat, with their direction explicitly included, only make sense in a 2D context, and it helps separate vectors and 1xn/nx1 matrices more meaningfully. The convenient [v;w] becomes cat(v,w) or cat([v,w]).
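
A rough sketch of what the vertical half of this would look like, written with today's functions and assuming Julia 1.x. The helper names flatcat and colcat are hypothetical and exist only for this illustration; nothing in Base is changed.

```julia
# Hypothetical stand-in for the proposed cat(v, w): vectors in, vector out.
flatcat(vs::AbstractVector...) = vcat(vs...)

# Hypothetical stand-in for the proposed vcat: always returns an n×1 matrix.
colcat(vs::AbstractVector...) = reshape(vcat(vs...), :, 1)

v = [1, 2]
w = [3]

println(flatcat(v, w))        # a 3-element vector
println(size(colcat(v, w)))   # (3, 1)
```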

To me that’s cleaner, but maybe other beginners would be confused by distinguishing vectors from 1xn and nx1 matrices this way. I would be interested to get other people’s ideas.

I can sympathize with this.

  • vcat produces nx1 matrices.
  • hcat produces 1xn matrices

Minor point: vcat and hcat can produce arbitrary width and height matrices if their inputs are wide/tall respectively. E.g.:

julia> [rand(2,2); rand(3,2)]
5×2 Array{Float64,2}:
 0.187175   0.629967
 0.706591   0.938334
 0.781615   0.0100944
 0.430575   0.95716
 0.0332201  0.211566

julia> [rand(2,2) rand(2,3)]
2×5 Array{Float64,2}:
 0.00450999  0.636033  0.674548  0.369836  0.925724
 0.88356     0.163572  0.472557  0.960674  0.407421

I agree too that this would make more sense. Right now, it seems the recommended way to make an nx1 array is with the transpose operation, which could lead to unnoticed mistakes if you don’t realize that ' gives a conjugate transpose by default:

julia> [1+2im 2+3im]'
2×1 Array{Complex{Int64},2}:
 1-2im
 2-3im
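
For comparison, a small sketch assuming Julia 1.x, where the plain transpose is spelled transpose(A) and ' is the adjoint. The reshape line is an alternative that avoids transposition altogether.

```julia
r = [1+2im 2+3im]            # 1×2 complex matrix

c_adj = r'                   # conjugate transpose: entries are conjugated
c_tr  = transpose(r)         # plain transpose: entries unchanged
c_rs  = reshape(r, :, 1)     # 2×1 matrix without any transpose

println(collect(c_adj))
println(collect(c_tr))
println(c_rs == c_tr)
```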

I was actually surprised at the current behaviour, since the manual (very vaguely) implies that the semicolon gives “arrays”, as opposed to the line right above it, which explains that the comma gives “1-d arrays”. The special syntax table also gives the impression that spaces and semicolons should output the same type of object.

Additionally, you can create an nx1 array from a 1-d vector (or iterable) by using hcat, e.g.:

hcat([1, 2, 3])
hcat(1:3)

That cheat sheet is incorrect.

A.'  # transpose of A
A'   # CONJUGATE transpose of A

(i.e. the same as MATLAB)

PS there is a whole separate discussion about the merits of having a syntax use . in a non-broadcasty way.

x-ref: https://github.com/JuliaLang/julia/issues/19622

I have to say that I am not a fan of vcat returning a matrix even if the concatenated arrays are one-dimensional. I do use vectors more than matrices, and overall I think a one-dimensional vector is a more useful result type than an n×1 matrix in a greater number of situations.
