the `range` improvement for 1.1

Hi all, first time posting. Apologies anywhere my understanding is off, and please forgive that I am not a great communicator in this domain. I’ve nonetheless invested some time to write the most careful and concise observation I can.

Looking over PR #25896 and subsequently #28708, the latter providing the new behavior for 1.1, I was struck by the opportunity to take this improvement further and cap off the issue with a more logically complete and lasting resolution. Or so it seems to me.

You’d already put this one behind you, I believe. :ok_hand:

The new Base.range as stylized in the documentation is

range(start[, stop]; length, stop, step=1)

with some constraints regarding which of the arguments should be specified at a time.

This bracket stylization has its limitations, so for the moment I’ll rewrite it as two methods, still written in the style of the documentation (as when it lists multiple methods) and not as the declarations in range.jl.

range(start; stop, length, step=1)
range(start, stop; length, step=1)

Now, I’m aware the second of these has been given an extra constraint in #28708, making the default =1 effectively disappear; but for the moment let’s forget this, and instead consider how these two unchanged naturally extend to four:

range(; start, stop, length, step=1)
range(start; stop, length, step=1)
range(start, stop; length, step=1)
range(start, stop, length)

Humoring these now to be the four methods of range, I suppose the natural documentation would be

range([start, stop, length]; start, stop, length, step=1)

Note such repetitive positional-and-keyword argument idiom seems inevitable, for such a “specify any 3 arguments” kind of function that we don’t build entirely of keyword arguments.

Now, what do we observe about the two additional methods?

First, the no-positional-arguments method adds functionality, namely that start needn’t be given. For example, range(stop=100,length=3) specifies the 3 integers leading up to 100, which would otherwise have to be constructed through manipulation: reverse(range(100,length=3,step=-1)). The benefit, I suppose, is natural specification without thought, whatever that’s worth.
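To check that the workaround really does give the three integers leading up to 100, here is a small snippet that uses only calls available in 1.0, as far as I know:

```julia
# Count down from 100 in three steps, then flip the order.
r = reverse(range(100, length=3, step=-1))

collect(r)    # [98, 99, 100]
r == 98:100   # true
```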

Second, now that the two-positional-arguments method has been added, the three-positional-arguments method appears an even more natural and obvious extension than that one was. Needless to say, this would throw a real Christmas gift to the linspace lamenters and likeminded future users. Ironically, range(a,b,n) is even shorter than linspace(a,b,n). More importantly, it is highly logical and folded into range.

That said, perhaps we should hesitate, because the pre-0.7 range took a different 3 positional arguments, and optionally a different 2. That would be the biggest reason for pause. I’d prefer to focus on the longer term, though.

Next, and perhaps of equal value to any gain noted so far, this extension by 2 methods may also lead to the range that is simplest to describe.

(For context, it seems to me the 1.0 & current 1.1 manual entries are somewhat awkward to digest; I did read them up and down several times to understand precisely the argument possibilities. It was not even mentioned that this function is fundamentally one of specifying any 3 of 4 given characteristics, or any 2 of 3 if start is required.)

So now I’ll take a moment to have a go at writing what a new manual entry might look like. Forgive me, for I’ll also increase it informationally in ways unrelated to this discussion.


Base.range Function.

range([start, stop, length]; start, stop, length, step=1)

Construct a range, i.e. the compact representation of a linearly spaced sequence of values, from any 3 of: start, stop, length, and step.

start is the first element; stop is the last element (or its bound, if start, stop, and step are used); length is the number of elements; and step is the spacing of elements.

If all 4 are specified, they must agree. If only 2 are specified, neither being step, then step defaults to 1 to provide the third. Otherwise precisely 3 must be specified.

start, stop, and length each may be specified as either keyword or positional argument, the latter subject to all preceding positional arguments being present, of course.

The returned range is a subtype of AbstractRange: a UnitRange, StepRange, StepRangeLen, or LinRange depending on the arguments.

range(a,b,step=s) is equivalent to a:s:b, and range(a,b) equivalent to a:b. See (:).

! Julia 1.1
Prior to Julia 1.1, the calling syntax is limited to range(start; stop, length, step=1).

Examples


That would be my attempt at a manual entry, followed then by plentiful examples.
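To make the rules of this entry concrete, here is a rough sketch of how they could be layered on top of the existing range(start; ...) form. The name anyrange is made up for illustration and nothing like it exists in Base. I am also assuming that "only 2 specified, neither being step" simply means step falls back to 1, and the agreement check for all 4 uses a naive ==, which would likely be too strict for floating-point arguments.

```julia
# Hypothetical sketch: `anyrange` is a made-up name, not part of Base.
function anyrange(; start=nothing, stop=nothing, length=nothing, step=nothing)
    nargs = count(x -> x !== nothing, (start, stop, length, step))
    # Only 2 specified, neither being step: step defaults to 1.
    if nargs == 2 && step === nothing
        step = 1
        nargs = 3
    end
    if nargs == 4
        # All 4 specified: build from three of them and check the fourth.
        r = Base.range(start; step=step, length=length)
        last(r) == stop || throw(ArgumentError("start, stop, length and step disagree"))
        return r
    end
    nargs == 3 || throw(ArgumentError("specify 3 of start, stop, length, step"))
    if start === nothing
        # No start given: count back from stop.
        start = stop - (length - 1) * step
        return Base.range(start; step=step, length=length)
    end
    return Base.range(start; stop=stop, length=length, step=step)
end

# Positional forms forward to the keyword-only method.
anyrange(start; kw...) = anyrange(; start=start, kw...)
anyrange(start, stop; kw...) = anyrange(; start=start, stop=stop, kw...)
anyrange(start, stop, length; kw...) =
    anyrange(; start=start, stop=stop, length=length, kw...)
```

With this sketch, anyrange(stop=100, length=3) compares equal to 98:100, anyrange(0, 1, 5) to 0:0.25:1, and anyrange(1, 10, 4; step=3) to 1:3:10, while anyrange(start=1, step=2) throws an ArgumentError. It does not guard against passing the same argument both positionally and by keyword.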

Finally, getting back to the design decision making the default =1 disappear in the two-positional-arguments method, which I presumed not to be the case in the reasoning above.

This special handling was proposed and liked in #28708, notably by @jeff.bezanson and @StefanKarpinski. To paraphrase, the idea was to go with this handling “for now” as the conservative way to leave options open for the future and for possible current clarity.

One problem, however, as noted by @martinholters (who nonetheless accepted it), is that this handling contradicts the documentation. It not only contradicts the current 1.1 documentation, but also seems difficult to reconcile with either form of the manual entry’s signature,

range(start[, stop]; length, stop, step=1)

or

range([start, stop, length]; start, stop, length, step=1)

(And if it is explained therein, it adds complexity.)

If I understand #28708 correctly, the option left open for the future is making the two-positional-arguments method have default length rather than step. (Was there any other eventuality?) The convenience gain would be to specify things like range(0,10) instead of the longer range(0,10,length=100) for common usages such as plotting, if 100 were defaulted.

On the other hand, one alleviation brought about by the syntax of this post is that this is no longer so lengthy: range(0,10,100).

(Perhaps some might also appreciate maintaining the visibility of 100 here, it being a somewhat arbitrary magic number. In contrast, as @jeff.bezanson noted, the default 1 of step is arguably not so magical.)

Regarding the other cited benefit, more calling clarity, there is one thing I did not understand in #28708 if @jeff.bezanson could explain.

You noted that ensuring every call of range to have a keyword argument guarantees some level of clarity at the call site, to paraphrase. I suppose you mean that clarity is negatively related to how many non-keyword arguments there are, each of which is intelligible only through positional inference, rather than meaning there is a particular difference in having at least one keyword. Did I get you? Your clarity :stuck_out_tongue_winking_eye: is appreciated, thank you.


I’m trying to find the call to action here. Can you boil this down a bit? If you have an improvement in mind for the documentation, the best way to get it reviewed is to submit a PR to the source.

If I understand #28708 correctly, the option left open for the future is making the two-positional-arguments method have default length rather than step. (Was there any other eventuality?) The convenience gain would be to specify things like range(0,10) instead of the longer range(0,10,length=100) for common usages such as plotting, if 100 were defaulted.

There was a marked lack of enthusiasm when I brought this up. One could argue for a two-argument LinRange constructor with a default length, but that’s still going to be an uphill battle.

Yes; I don’t think we should have range(a,b,c) because it’s too hard to guess what the arguments mean. Making it worse is that we have a:b:c where the order is start,step,stop, so range(a,b,c) using the order start,stop,length would be very confusing IMO.
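To illustrate the ordering clash, here is a small comparison using only syntax that exists today: the colon form puts the step in the middle, while the keyword form makes the meaning of each number explicit.

```julia
# Colon syntax: the middle value is the step.
a = 1:2:10                        # start = 1, step = 2, stop bound = 10
collect(a)                        # [1, 3, 5, 7, 9]

# Keyword form: 10 evenly spaced values from 1 to 2.
b = range(1, stop=2, length=10)
(length(b), first(b), last(b))    # (10, 1.0, 2.0)
```

The same three numbers 1, 2, 10 produce two very different ranges depending on which convention the reader assumes.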

Yes, that seems like a reasonable extension to me. :+1:


range(stop=100) could even return a OneTo for efficiency.


I suspect most uses of range with length have either start and stop as floats, or have stop-start lower than length, so I don’t think it would be that confusing to allow range(a,b,length) in practice. Converting linspace to range was annoying and resulted in less concise code. The situation is much better now with stop being positional, but it’d be nice to get rid of those length arguments. That does go against Python, which uses range(start,stop,step), but it’s one of those things that varies a lot from language to language (e.g. in Fortran it’s do i = a, b, step), so you have to read the docs to see what it does anyway.

You can already get this behavior via LinRange:

julia> LinRange(1, 2, 5)
5-element LinRange{Float64}:
 1.0,1.25,1.5,1.75,2.0

It’d be nice to have a two-argument constructor with a default length of 100 for interactive use:

julia> LinRange(0, π)
100-element LinRange{Float64}:
 0.0,0.0317333,0.0634665,0.0951998,0.126933,…,3.04639,3.07813,3.10986,3.14159

I did not know that, thanks!

Many books on coding style discourage 3+ positional arguments because they are difficult to remember, require a lookup in the documentation, and thus can mask bugs.

Base and the standard libraries have been moving in this direction too, moving extra positional arguments into keywords. A nice example is


Yes, there has been a global push towards keyword arguments in Julia 1.0. This has been great for consistency, but the likes of range(0,stop=1,length=100) and dropdims(sum(a,dims=1),dims=1) are not that nice to write or read.
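For anyone who hasn't run into the second one yet, here is what that expression does on a small matrix (the values are just an illustration):

```julia
a = [1 2; 3 4]

s = sum(a, dims=1)          # 1×2 matrix holding the column sums 4 and 6
v = dropdims(s, dims=1)     # drop the singleton dimension: [4, 6]

size(s)                     # (1, 2)
v == vec(s)                 # true
```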

There is a tension between usability/concision and consistency/purity (think MATLAB vs Scheme); I feel that the balance has shifted towards the latter in several decisions, as Julia distances itself from being a language for technical computing.


Perhaps you missed

There is a tension between usability/concision and consistency/purity (think MATLAB vs Scheme); I feel that the balance has shifted towards the latter in several decisions, as Julia distances itself from being a language for technical computing.

I think (and hope) this shift is also largely due to the non-breaking constraint of the 1.0 era. In my view, the mindset was to better make things clean and precise for 1.0, even at the expense of verbosity. There were a lot of “we can decide/add that later” statements since adding convenience functionality typically isn’t breaking. Therefore I have hope that we’ll get (back) convenience methods in the future.

dropdims(sum(a,dims=1),dims=1)

Yeah, I have a bunch of those overly verbose expressions (dims occurs three times!) in my code now as well.


9 posts were split to a new topic: The verbosity of dropdims…dims…dims (was: range improvement for 1.1)