One-sided ranges in array indexes

Suppose I have an array a = [1, 2, 3, 4, 5].

I can index into it at a single value: a[2] == 2. I can index it with a range: a[2:4] == [2, 3, 4]. I can even get the whole array back with a[:] == a.
Where this pattern breaks down is when writing a one-sided range: a[3:] does not give a[3:5] == [3, 4, 5].

Yes, we have end for this purpose and, as of v1.4, begin for ranges that start at the first index.

But is there a technical reason why syntax like a[3:] can’t be translated to a[3:end], yielding [3, 4, 5]? Or similarly, why a[:3] can’t become a[begin:3] == [1, 2, 3]? This feels cleaner, and is similar to Python slice syntax.

This syntax would be even more useful in multidimensional arrays, while remaining unambiguous (as far as I can tell). So instead of b[3:end, begin:4], you’d have b[3:, :4].
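For reference, here is a minimal sketch of what the explicit begin/end forms do today (Julia 1.4 or later). The 4×5 matrix b is made up purely for illustration; it is not from any real code.

```julia
a = [1, 2, 3, 4, 5]
b = reshape(1:20, 4, 5)        # hypothetical 4×5 matrix, filled column by column

tail = a[3:end]                # everything from index 3 onward
head = a[begin:3]              # everything up to and including index 3
block = b[3:end, begin:4]      # rows 3 to 4, columns 1 to 4

println(tail)                  # [3, 4, 5]
println(head)                  # [1, 2, 3]
println(size(block))           # (2, 4)
```

The proposal would amount to letting the parser fill in begin and end when one side of the colon is left empty.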

Currently, a[3:] yields a range syntax error:
ERROR: syntax: missing last argument in "3:" range expression

However, a[:3] is interpreted as a[3]. While this is valid code, I don’t believe that anyone would intentionally write this expecting to get a[3].
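A quick check of that claim, as a small sketch: quoting a literal number with : just gives the number back, so the index expression never contains a range at all.

```julia
a = [1, 2, 3, 4, 5]

# Quoting a literal returns the literal itself, so `:3` is just the integer 3.
quoted = :3
println(quoted === 3)          # true
println(quoted isa Integer)    # true

# Hence a[:3] is the same single-element lookup as a[3], not a slice.
println(a[:3] == a[3])         # true
```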

Would it be possible to integrate this syntax into Julia without breaking any existing code or adding too much parse logic?

Yes. However, ranges also work with variables: if you have two integers called a and b, then a:b is a range from a to b, whereas :b is Symbol("b"). How would you deal with that?


Perhaps because you have a Python background? To me, a[:3] looks unbalanced.

What if you replace 3 with a variable?

julia> a = [1,2,3,4,5];

julia> i = 3;

julia> a[:i]
ERROR: ArgumentError: invalid index: :i of type Symbol

:i is a symbol.

Given that, when writing expressions, names like i are symbols while numbers like 3 are literal integers:

julia> dump(:((i,3)))
Expr
  head: Symbol tuple
  args: Array{Any}((2,))
    1: Symbol i
    2: Int64 3

I’d expect to get the symbol i and integer 3 from :i and :3, respectively.

Also, a[:i] is not something you can assume was unintentional, like you might assume for :3.

julia> struct Foo end

julia> Base.getindex(::Foo, x) = x

julia> a = Foo();

julia> a[:i]
:i

Is it possible you could use :: for this?

julia> :(A[3::])
ERROR: syntax: unexpected "]"
Stacktrace:
 [1] top-level scope at none:1

julia> :(A[::3]) |> dump
Expr
  head: Symbol ref
  args: Array{Any}((2,))
    1: Symbol A
    2: Expr
      head: Symbol ::
      args: Array{Any}((1,))
        1: Int64 3

I think it’s always an error if nothing comes before. But perhaps it’s too weird?


It may be unbalanced, but it is easy to type and easy to read.

As for symbol indices, that’s a valid point. The syntax a[3:] may remain unambiguous, and be more frequently needed, but it’s probably too confusing to allow one-sided ranges in only one direction.

While in theory this could work, I think it loses the key advantages of symmetry and discoverability that a single colon would have.

Actually, assuming all indices along a given dimension have to be of the same type, shouldn’t it be unambiguous which is meant? For instance, if a given dimension is indexed by symbols, the existing behavior would be retained, whereas if it were indexed by integers, the proposed one-sided range would remain unambiguous.

That’s just not how Julia’s parser works. Whether something is a symbol or a range expression needs to be decided at parse time, where no information about types is available. This would also be fairly fragile — an example where this breaks down is DataFrames.jl, where DataFrames can be indexed with symbols as well as ranges. I would also suggest reading this post by Stefan, since this behavior is now fairly established in Julia and found to work quite well, so it is very unlikely this is going to change.
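To make the parse-time point concrete, here is a small sketch using Meta.parse, which turns source text into an expression without evaluating anything. The helper parses is a name invented for this example, not part of Base.

```julia
# The parser sees only text; it never learns what `a`, `i`, or `j` hold.
sym_index   = Meta.parse("a[:i]").args[2]    # a quoted symbol
range_index = Meta.parse("a[i:j]").args[2]   # a call to the `:` function
lit_index   = Meta.parse("a[:3]").args[2]    # a plain integer literal

println(sym_index isa QuoteNode)                           # true
println(range_index isa Expr && range_index.head === :call) # true
println(lit_index === 3)                                   # true

# Hypothetical helper: does this source text parse at all?
function parses(src)
    try
        Meta.parse(src)
        return true
    catch
        return false
    end
end

println(parses("a[3:end]"))
# The opening post reports a syntax error for this form:
println(parses("a[3:]"))
```

All three index forms are already distinct objects before any array exists, which is why the element type of a dimension cannot be used to disambiguate them.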


I’m certain something like this (not with : but maybe :: as suggested) could be made possible, however,

I strongly disagree (I think that’s the Python talking :wink: ). Reading
b[3:, :5] or b[:3, 5:]
practically makes me dizzy!


Note that the a[:x] syntax would have to work not only for literals like 3, but general expressions, as it does currently. There you would run into breaking changes as : currently quotes the expression. Cf


In my opinion, the further we are away from Python’s array slicing syntax, the better.


This also seems prone to silent errors. Suppose I really meant to write mean(a[1:5]) and instead wrote mean(a[1:]). Right now this errors; under your proposal it would happily run as mean(a[1:end]), which is undesirable IMO.


It’s a matter of personal preference. In Julia readability seems to be generally preferred over codegame-like syntax.

Just compare

  • a[1:5] vs a[:5]

  • a[5:end] vs a[5:]

and you will easily see which one can be understood at a glance.

I think this also goes well with 1-based indexing and end instead of -1.
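As a small illustration of that last point (a sketch only, reusing the five-element array from the opening post): end takes part in ordinary arithmetic inside the brackets, so counting back from the last element needs no negative indices.

```julia
a = [1, 2, 3, 4, 5]

last_elem   = a[end]          # last element
second_last = a[end-1]        # one before the last
last_two    = a[end-1:end]    # the final two elements
whole       = a[begin:end]    # the full array, spelled out

println(last_elem)            # 5
println(second_last)          # 4
println(last_two)             # [4, 5]
println(whole == a)           # true
```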
