'Negative' UnitRange in Julia?

DorianT · December 9, 2021, 10:21pm

Today I came across some behaviour regarding UnitRanges that I did not quite understand. I was hoping to find some enlightenment here.

The issue boils down to the following behaviour; more context is given below:

julia> 4:8
4:8

julia> 8:4
8:7

julia> 8:4 |> typeof
UnitRange{Int64}

@code_lowered and @code_typed also produced similar output for both. However:

j = 1:2:20 |> collect

julia> j[4:8]
5-element Vector{Int64}:
  7
  9
 11
 13
 15

julia> j[8:4]
Int64[]

But most confusing to me is that:

julia> j[8:-1:4]
5-element Vector{Int64}:
 15
 13
 11
  9
  7

Produces just the behavior I would expect?

In R this just works. Is it discouraged in Julia for some reason?

r = seq(1,20,2)
r[8:4]
[1] 15 13 11  9  7

PS: I came across this behaviour while indexing a matrix with two ranges to get a submatrix; sometimes the ranges can be ‘negative’ (i.e. the larger number comes first).

My questions are: Why is Julia currently behaving like this? I especially don’t understand the output of negative ranges always being start:start-1. Is there a better way to do this in base or is there a package that can be used to enable this kind of behaviour?

Edit: Typos

stillyslalom · December 9, 2021, 10:48pm

(a::Int):(b::Int) creates a UnitRange, which has a step of +1 by definition:

help?> UnitRange
search: UnitRange AbstractUnitRange

  UnitRange{T<:Real}

  A range parameterized by a start and stop of type T, filled with elements
  spaced by 1 from start until stop is exceeded. The syntax a:b with a and b
  both Integers creates a UnitRange.

The negative range endpoint coercion allows the length of a UnitRange to always be equal to stop - start + 1, which avoids an unnecessary branch in subsequent calculations of length. If you want to create StepRanges with a unit step of the proper sign, you could do something like this:

julia> ..(a, b) = a:sign(b - a):b
.. (generic function with 1 method)

julia> 4..8
4:1:8

julia> 8..4
8:-1:4

Edit: this is more robust (works for a == b):

julia> ..(a, b) = a:sign(b - a + (a == b)):b
.. (generic function with 1 method)

julia> 2..2
2:1:2
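As a rough sketch of how this idea applies to the original matrix-indexing use case, the same logic can be written as a named function instead of an operator. The name `anyrange` is made up for this example and is not part of Base; the matrix `A` is arbitrary sample data.

```julia
# Hypothetical helper (not part of Base): always returns a StepRange,
# stepping by +1 or -1 depending on the order of the endpoints.
anyrange(a::Integer, b::Integer) = a <= b ? (a:1:b) : (a:-1:b)

A = reshape(1:20, 4, 5)   # 4×5 matrix, filled column by column

# Rows 3, 2, 1 (descending) and columns 2, 3, 4 (ascending).
sub = A[anyrange(3, 1), anyrange(2, 4)]
```

Because both branches use an explicit step, the helper returns a StepRange in either direction, and equal endpoints give a one-element range.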

DorianT · December 9, 2021, 11:11pm

The negative range endpoint coercion allows the length of a UnitRange to always be equal to stop - start + 1, which avoids an unnecessary branch in subsequent calculations of length.

So this refers to the compiler having to do less work with the current implementation?

Thanks for the nice explanation though. Your proposed solution is great, I will implement it right away.

I am still a bit curious as to why, for ranges start:stop with start > stop, a range of start:start-1 is currently returned. Is there a particular use case for this? Would an error not be more desirable? Otherwise an empty collection is silently returned, which may lead to unintended behaviour.

jling · December 10, 2021, 1:27am

Yes: when you loop over two end points, you get the correct behavior (the range is empty, has length 0, and the loop is skipped) without manually worrying about which number is larger.
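A small sketch of that point, using the range from the original question. The helper `count_iterations` is a made-up name for illustration only.

```julia
# Hypothetical helper: counts how many times a loop over `r` runs.
function count_iterations(r)
    n = 0
    for _ in r
        n += 1
    end
    return n
end

r = 8:4                        # stored as 8:7, an empty UnitRange

skipped = count_iterations(r)  # the loop body never runs

# The length formula from the earlier reply holds without a branch:
len = last(r) - first(r) + 1
```

Here `skipped` and `len` both come out as 0, matching `length(r)`.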

Julia implements shortcuts (specialized methods) where they make sense, for example:

even if the range is not “Unit” anymore, there’s still a shortcut.

mkitti · December 10, 2021, 2:33am

The more general way of creating a range is via range. There you can also specify length or step:

julia> r_length = range(start=3, stop=-6, length=10)
3.0:-1.0:-6.0

julia> typeof(r_length)
StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}

julia> collect(r_length)
10-element Vector{Float64}:
  3.0
  2.0
  1.0
  0.0
 -1.0
 -2.0
 -3.0
 -4.0
 -5.0
 -6.0

julia> r_step = range(start=3, stop=-6, step=-1)
3:-1:-6

julia> typeof(r_step)
StepRange{Int64, Int64}

julia> collect(r_step)
10-element Vector{Int64}:
  3
  2
  1
  0
 -1
 -2
 -3
 -4
 -5
 -6
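Applied to the vector from the original question, a descending range built with `range` might look like the sketch below. It uses the positional `range(start, stop; step)` form, on the assumption that this is more widely available than the `start` keyword, which as far as I know needs Julia 1.7 or later.

```julia
j = collect(1:2:20)            # the vector from the original question

idx = range(8, 4; step=-1)     # same as 8:-1:4, a StepRange

picked = j[idx]                # elements 8, 7, 6, 5, 4 of j, in that order
```

This gives the same result as the `j[8:-1:4]` call shown in the first post.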

DNF · December 10, 2021, 10:15am

There’s also the issue of type stability, which is so central in Julia (and in particular for a commonly used construct like UnitRange). The step length of +1 is encoded in the type itself. If stop<start caused the step to become -1, it would no longer be a UnitRange, and when output types depend on values, you have a type instability.

BTW, in contrast to R, this is Matlab:

>> 5:3
ans =
  1×0 empty double row vector
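To make the type-stability argument concrete, here is a minimal sketch comparing two made-up helpers (`either_unstable` and `either_stable` are hypothetical names, not Base functions). The first mimics what a direction-flipping `a:b` would have to do; the second always uses an explicit step.

```julia
# Return type depends on the *values* of a and b: type-unstable.
either_unstable(a, b) = a <= b ? (a:b) : (a:-1:b)

# Always returns a StepRange, whatever the order of a and b: type-stable.
either_stable(a, b) = a <= b ? (a:1:b) : (a:-1:b)

t_up   = typeof(either_unstable(4, 8))   # a UnitRange type
t_down = typeof(either_unstable(8, 4))   # a StepRange type

s_up   = typeof(either_stable(4, 8))     # a StepRange type
s_down = typeof(either_stable(8, 4))     # the same StepRange type
```

Running `@code_warntype either_unstable(4, 8)` in the REPL should flag the union return type of the first version.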

DorianT · December 10, 2021, 11:01am

Thanks for the further explanations! I also had a look at the issues and found https://github.com/JuliaLang/julia/issues/40331, which will probably help others avoid the confusion I had.