Why is the following behaviour not a bug?

colintbowers · October 28, 2020, 11:00am

I just observed the following on v1.4 (so apologies if it is gone on v1.5).

julia> x = [ rand(0:9, 2) for a = 1:3 ]
3-element Array{Array{Int64,1},1}:
 [3, 8]
 [2, 4]
 [3, 1]

julia> x[1,1]
2-element Array{Int64,1}:
 3
 8

julia> x[1]
2-element Array{Int64,1}:
 3
 8

Why does x[1,1] work at all, and given that it does work, why does it return the same thing as x[1]? And given that it does work, why does x[1,2] not work?

I’m sure there must be a reason but can’t for the life of me see what it is.

Cheers

Colin

oheil · October 28, 2020, 11:07am

There is a short note at the end of this chapter:
https://docs.julialang.org/en/v1/manual/arrays/#Omitted-and-extra-indices

You can in your example do:

julia> x[1,1,1,1,1]
2-element Array{Int64,1}:
 7
 3

DNF · October 28, 2020, 12:03pm

It can’t work for x[1,2] since that would require x to be a matrix with at least two columns, which it isn’t. The fact that you can append arbitrary numbers of 1s to the indices is indeed a bit confusing, given that there is a clear separation between vectors and matrices.
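
A minimal sketch of that rule, using a fixed x in place of the random one from the original post (the values are copied from that output, and the variable name err is only for illustration):

```julia
# Fixed stand-in for the random vector of vectors in the original post.
x = [[3, 8], [2, 4], [3, 1]]

# Trailing indices equal to 1 are accepted, so these both return x[1].
println(x[1, 1] == x[1])
println(x[1, 1, 1, 1, 1] == x[1])

# A vector reports a size of 1 along every dimension past its first.
println(size(x, 2))

# Any trailing index other than 1 is therefore out of bounds.
err = try
    x[1, 2]
catch e
    e
end
println(err isa BoundsError)
```

So x[1,2] fails for the same reason that m[1,2] would fail on a 3×1 matrix m: there is no second column to index into.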

oheil · October 28, 2020, 12:40pm

Quoted from the docs:

Similarly, more than N indices may be provided if all the indices beyond the dimensionality of the array are 1 (or more generally are the first and only element of axes(A, d) where d is that particular dimension number).

This seems to be general enough to hold for every number of indices.

This allows vectors to be indexed like one-column matrices, for example

This is the example use case.
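
As a rough illustration of that use case (the helper name first_column_sum is made up for this sketch and does not come from the docs), a function written with two indices can accept both a vector and a one-column matrix:

```julia
# Hypothetical helper: sums the first column using two-index access.
first_column_sum(a) = sum(a[i, 1] for i in axes(a, 1))

v = [10, 20, 30]        # a Vector, ndims(v) == 1
m = reshape(v, 3, 1)    # a 3×1 Matrix, ndims(m) == 2

println(first_column_sum(v))
println(first_column_sum(m))

# The types remain distinct even though the indexing looks the same.
println(v isa Matrix)
println(axes(v, 2))
```

The trailing 1 is only an indexing convenience here; v is still a one-dimensional array and does not become a matrix.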

The clear separation of vectors and matrices does exist in Julia, but that separation is actually what I wonder about. I would expect the multidimensional array to be the base for everything, in other words the general concept. But of course this would be academic and would get in the way of the best performance and memory efficiency.
So I see it this way: the general concept is the multidimensional array and the pragmatic specializations are numbers (0-dim), vectors (1-dim) and matrices (2-dim). And in several packages there are more of these specializations.

Confusing? Only for mathematicians.

DNF · October 28, 2020, 12:47pm

Yes, the docs are clear enough, but it clashes with my explanation when I try to tell people that vectors are different from an Nx1 matrix, saying that there is no second dimension at all.

Maybe I misunderstand, but please, no. Vectors as Nx1 matrices is a terrible thing, in fact I think it may be my least favorite thing with Matlab(!) It causes endless troubles, and I’m not talking about performance.

Tamas_Papp · October 28, 2020, 12:57pm

This is a design choice that was taken very early in Julia and cannot be changed now as it would break the API, but with hindsight it is not so obvious that this was the best solution.

These things will be with us for a while, and may not be worth changing later on either, but it is OK to find them confusing. Julia, like every language, does have some warts, it is not necessary to rationalize them.

oheil · October 28, 2020, 1:07pm

Relax, there is nothing I want to change. It was only meant as a point of view on some things and on what could confuse and what not. Of course, this is individual taste, and my point of view is: first the general multidimensional array, and after that the specialized implementations. So I am not confused about

julia> x[1,1,1,1,1]
2-element Array{Int64,1}:
 7
 3

DNF · October 28, 2020, 1:27pm

Since I spend so much time fuming over the lack of distinction between vectors and matrices in Matlab (it is my day job), I’m curious if you can explain this a little more closely. What does this mean, and is it different from Julia, and similar to Matlab?

oheil · October 28, 2020, 1:47pm

This I can’t answer as I do not have any experience with matlab.

For this it seems that you read too much into my remark. It was meant with regard to this statement quoted from the docs:

As you find it confusing that x[1,1,1,1,1,...addmore] works, given that vectors and matrices are separate concepts/implementations.

What I want to say is: when I read similar sentences in the docs, I have a general NxNxNx…xN (multidimensional) array in mind as a picture, and with this, the above statement in the docs is in no way confusing. That’s all.

I know there are other languages where it is different, where e.g. numbers are internally 0-dim arrays (is it Haskell? I don’t know and it doesn’t matter, so I won’t look it up for now), and this may make sense for those languages, but of course, as @Tamas_Papp pointed out, it doesn’t make sense for Julia to have this type of academic stringency. And now I have learned that Matlab seems not to separate 1-dim arrays (vectors) from 2-dim arrays (matrices); maybe Matlab does not need to distinguish them because the implementation (probably in C) would not benefit in terms of memory usage and performance.

Phew, many words for the little I have to say about it. But I hope that OP understands why it isn’t a bug, but a feature.

Tamas_Papp · October 28, 2020, 1:50pm

I think you misunderstood my post: I am arguing that supporting the trailing 1s makes little sense with how Julia works currently. It is a historical artifact, it did make some algorithms easier initially.

I would prefer catching x::Vector[1,1] etc as an error. But, again, that would break the API now.

DNF · October 28, 2020, 1:52pm

Thanks for your reply. We were perhaps talking about different things (though I’m not completely certain.)

The problem with this in Matlab is really bad, not in terms of performance, but in terms of how it makes code a horrible mess. In my (very strong) opinion it was a very bad decision by Matlab.

But enough with the off-topic. Sorry.

oheil · October 28, 2020, 1:53pm

I see. But still it isn’t a bug and it is somehow well documented.

Here I don’t have any opinion. Is there something planned for Julia 2.0?

oheil · October 28, 2020, 1:54pm

No need to be sorry. I don’t think it’s off-topic, because everything may help OP to understand why something is as it is and how others think of it.

Tamas_Papp · October 28, 2020, 5:39pm

I don’t know where the threshold will be between revising minor warts (which not everyone considers warts, BTW) and keeping compatibility, but I suspect it will lean toward the latter. Cf

colintbowers · October 29, 2020, 3:32am

Ah. I did look at the docs before posting, but I missed that bit. Thanks for pointing it out.

colintbowers · October 29, 2020, 3:57am

This was an interesting discussion - thank you everyone for responding.

Personally I’m with Tamas and DNF in that I don’t like the idea of allowing trailing ones when indexing arrays. To be clear, I’m definitely not suggesting syntax changes at this stage of the language - I follow the discussions on this board enough to know that those types of posts don’t go down so well these days.

I can perhaps add one point to the discussion which hasn’t been raised yet (and might be useful if this is ever discussed for v2.0). My logic for objecting to the syntax is that it can lead to accidental bugs when one mixes arrays of numbers with arrays of arrays of numbers in one’s code. Personally, I often find myself in situations where I have x::Matrix{Float64} and y::Vector{Vector{Float64}}. Obviously for individual elements I index x[n,m] and y[n][m]. But sometimes I make a mistake and accidentally index y[n,m].

From my point of view, life would be nice if this mistake errored, rather than me having to work backwards from another error in a different part of the code. And it is not hard to see how mixing higher dimensional arrays of numbers with arrays of arrays of arrays of numbers, or arrays of higher dimensional arrays of numbers, could lead to similar bugs that are harder to trace back (or worse, don’t result in errors at all). Of course, at heart, these types of bugs are my fault, not the language’s, since the original indexing error is made by me. And to be fair, it is so wonderfully easy in Julia to define my own types that wrap these objects but allow me to choose the indexing rules. But as we all know when coding, some mistakes are easier to make than others, and this feels like a particularly easy one to make to me.
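
To make both points concrete, here is a minimal sketch. The first part shows the silent mistake; the second defines a hypothetical wrapper type, StrictVector (a name invented for this sketch, not something in Base), that accepts exactly one index and throws on anything more:

```julia
y = [[1.0, 2.0], [3.0, 4.0]]    # Vector{Vector{Float64}}

# Intended: y[2][1], a single Float64.
println(y[2][1])
# Mistake: y[2, 1] does not error; it returns the whole inner vector y[2].
println(y[2, 1])

# Hypothetical wrapper that only allows one index.
struct StrictVector{T}
    data::Vector{T}
end

Base.length(s::StrictVector) = length(s.data)
Base.getindex(s::StrictVector, i::Integer) = s.data[i]

# Two or more indices are rejected instead of being silently accepted.
function Base.getindex(s::StrictVector, i::Integer, j::Integer, rest::Integer...)
    throw(ArgumentError("StrictVector takes exactly one index"))
end

ys = StrictVector([StrictVector([1.0, 2.0]), StrictVector([3.0, 4.0])])
println(ys[2][1])

# Capture the error so the script keeps running.
strict_err = try
    ys[2, 1]
catch e
    e
end
println(strict_err isa ArgumentError)
```

This wrapper deliberately does not subtype AbstractVector, so it gives up the generic array functions in exchange for the stricter indexing; a fuller version would have to decide how much of the array interface to implement.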

HaoxuanGuo · October 29, 2020, 6:11am

Array{Int64, 2} and Array{Array{Int64, 1}, 1} are different datatypes.

colintbowers · October 30, 2020, 2:08am

Yes understood. The discussion is about how the indexing rules (i.e. the getindex function) apply to these two types, and more generally, to higher-dimensional arrays of numbers vs arrays of arrays of numbers.

lmiq · October 31, 2020, 9:37am

Without breaking anything, perhaps there could be a macro that warns the developer about such things, like @code_warntype. Not related, but recently I used a filename that turned out to be forbidden on Windows. Someone used the term “linter” to suggest a package that could track these small syntax issues and alert the programmer.

There is one such linter already: Messages - Lint.jl

I am not sure if it is maintained. If so, that could be a pull request.

Edit: perhaps it supports that already: " * More indices than dimensions in an array lookup"

colintbowers · November 2, 2020, 12:00am

Nice find. It looks like the last PR was sometime in 2019 though…but yes, conceptually this is one solution.
