Why are the fields of UnitRange restricted to be <: Real? Using that type could make sense for ranges of other types which are represented by integers under the hood (e.g. Date uses StepRange with Day(1), even though it could use UnitRange). This would be useful since UnitRange has a lot of operations defined generically that StepRange does not (e.g. intersect, etc.).
My suspicion is that it's because UnitRange implies/assumes that the step is 1, which only makes sense if the type is real.
We do currently define step(r::AbstractUnitRange) = 1, but we could just as easily define step(r::AbstractUnitRange{T}) where T = oneunit(T) … seems like an improvement to me.
In olden days, we only had the multiplicative identity one(T), which would have made less sense for step, but now that some dashing programmer has added oneunit, maybe it is time to reconsider.
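To make the one vs. oneunit distinction concrete, here is a minimal sketch using the Dates standard library (assuming Julia 1.x). The function name proposed_step is hypothetical and only stands in for the suggested definition; it does not touch Base.step.

```julia
using Dates

# one(T) is the multiplicative identity, so for a Period it carries no unit.
identity_for_day = one(Day)

# oneunit(T) is "one of T", so it keeps the unit.
unit_for_day = oneunit(Day)

# Hypothetical stand-in for the proposed definition of step.
proposed_step(r::AbstractUnitRange{T}) where {T} = oneunit(T)

step_of_int_range = proposed_step(1:5)
```

For integer ranges the two definitions agree, which is why the change would not be visible for 1:5.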
If it isn't going to use oneunit it should be renamed to OneRange.
Thanks. I found that for my particular use case, https://github.com/JuliaMath/IntervalSets.jl does everything out of the box, so it can substitute for UnitRange.
step(r::AbstractUnitRange{T}) where T = T(1) seems safe. I can no longer remember why I picked 1 rather than T(1).
In contrast, step(r::AbstractUnitRange{T}) where T = oneunit(T) seems to open the door to 1m:10m. As I've argued elsewhere, I don't think that's a well-defined concept: we all believe that 10m == 1000cm, so why should length(1m:10m) == 10 while length(100cm:1000cm) == 901? (If you want to make a unitful range, physics seems to demand that the user supplies the step as an argument.) The definition step(r::AbstractUnitRange) = 1 is consistent with guarding against such problems.
In contrast, 1m..10m (as defined in the IntervalSets package) is perfectly well-defined, since there is no implied step.
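A rough way to see the unit dependence without any units package is to write the same two endpoints as plain integers in meters and then in centimeters, and count the elements that a step of "one unit" would produce. The helper name implied_length is made up for this sketch.

```julia
# Hypothetical helper: the number of elements if the step is "one" of whatever
# unit the endpoints happen to be written in.
implied_length(start, stop) = length(start:stop)

n_m  = implied_length(1, 10)       # endpoints written in meters
n_cm = implied_length(100, 1000)   # the same endpoints written in centimeters
```

The endpoints describe the same physical interval in both calls, yet the counts differ because the implied step differs.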
Because it is a "unit range" and you specified the units?
Because it is a "unit range" and you specified the units?
That's just a naming thing. Here we're using unit in the sense of 1, and I know you'd agree that 1m != 1. We could change the name if you think that would help.
This argument seems a bit circular. Yes, I know that's the current behavior. But I like the name UnitRange and wish we adhered to it more literally.
I honestly don't see any problem with saying that a:b is a "unit" range in steps of oneunit for the endpoints (promoted to a common type via promote).
I think that would be useful, strictly generalizes the current behavior, and conforms to most people's expectations of units, e.g. what they would expect for Day(1):Day(10) or 1m:100m if you asked them.
I agree that it seems natural when you are literally typing the characters in the console. But there are many cases where initial impressions lead you down the wrong path. Consider the impact of promotion alone:
julia> UInt8(0x01):Int16(5) # what would 1mm:2ft do?
1:5
julia> a, b = 1, 5
(1, 5)
julia> shift = 0.1
0.1
julia> (a:b) .+ shift == collect(a+shift:b+shift)
true
# Now try this with a, b = 1m, 5m and shift = 1mm.
# (a:b) .+ shift would have length 5
# a+shift:b+shift might have length 4001
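The unitful version of that experiment can be approximated in plain Julia by storing every length as an integer number of millimeters. That storage choice is an assumption made only to keep the sketch free of packages; the variable names are illustrative.

```julia
# All lengths stored as integer millimeters.
a, b  = 1000, 5000   # 1 m and 5 m
shift = 1            # 1 mm
m     = 1000         # the step that "one meter" would imply

# Build the range in meters first, then shift every element.
shifted_after = (a:m:b) .+ shift

# Shift the endpoints first; after promotion to millimeters, "one unit" is 1 mm.
shifted_before = (a + shift):(b + shift)
```

Both ranges start at 1001 mm and end at 5001 mm, but they contain very different numbers of elements, so shifting and range construction no longer commute.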
I was trying to think of cases in julia where a == x and b == y and yet op(a, b) != op(x, y). (E.g., a and b in meters and x and y in millimeters, and op is UnitRange.) I don't doubt that we can do that (thanks to dispatch), but I think we try to avoid it in general. For example, I think you'd be pretty unhappy if 2*x != 2.0*x. But that's precisely the kind of behavior you're asking for here. In places where we elevate the type above the concept (e.g., 1:5 != collect(1:5)), people tend to get pretty unhappy.
Asking the user to specify the step is not exactly a lot of work. Really, it's the only option given that physical units define an equivalence class (that's their core mathematical property): under such circumstances, there is no such thing as 1, and it would be dangerous to pretend otherwise.
All this seems pretty far from the OP. I agree it doesn't have to be Real, but the example of Dates is precisely the kind of behavior we don't want to enable.
I think that the following would be sensible behavior:
- a:b is parsed as colon(a, b), which dispatches to UnitRange in general (not StepRange like it does now),
- UnitRange(::T, ::T) checks some trait (e.g. has_unit_stepsize(T)), if that is false (the default), it throws an error. It should be true for <: Integer, Date, and similar user-defined types which have a "natural" stepsize.
- UnitRange(::S, ::T) promotes the arguments to a common type, calling the previous method.
So promote(1mm, 2ft) would either be undefined (I am unsure which package the example is for), or the UnitRange constructor would fail because it does not have a unit stepsize. 1m:3m would give something equivalent to [1m,2m,3m].
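The proposal above could be sketched roughly as follows (assuming Julia 1.x). The trait name has_unit_stepsize comes from the post; unit_range is a hypothetical stand-in for the proposed UnitRange constructor so that nothing in Base is redefined, and it returns a stepped range rather than an actual UnitRange.

```julia
using Dates

# Trait from the proposal: false unless a type opts in.
has_unit_stepsize(::Type) = false
has_unit_stepsize(::Type{<:Integer}) = true
has_unit_stepsize(::Type{Date}) = true

# Hypothetical stand-in for UnitRange(::T, ::T): check the trait, then step by
# one unit of the difference type (1 for integers, Day(1) for Date).
function unit_range(a::T, b::T) where {T}
    has_unit_stepsize(T) ||
        throw(ArgumentError("$T has no unit step size; give the step explicitly"))
    return a:oneunit(b - a):b
end

# Hypothetical stand-in for UnitRange(::S, ::T): promote, then call the method above.
unit_range(a, b) = unit_range(promote(a, b)...)
```

With these definitions, unit_range(1, 3) and unit_range(Date(2017, 10, 1), Date(2017, 10, 3)) each have three elements, while unit_range(1.0, 2.0) throws.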
I'm not saying we couldn't define those methods, I'm saying we shouldn't. The word "natural" is really scary: if I use a start of DateTime(2017, 10, 1) and a stop of DateTime(2017, 12, 1), what's natural? A step of a day or a month? (Both those dates are the first days of their respective months.) Remember that a number with physical units corresponds to some external reality independent of how you choose to describe that reality. If I tell you that I marked out a playing field by drawing lines of spacing 1 between my mailbox and my ditch, even if you know my yard you have no clue how many lines I drew. In contrast, if I say "1 fathom" then you know, and if you'd personally rather calculate in meters you can convert everything and come to the same answer I would while working in fathoms. You get the same answer independent of representation: that's the entire point of physical units.
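For those two endpoints, the explicit-step syntax that Dates already supports shows how much hangs on the choice. This is a small sketch using only the Dates standard library.

```julia
using Dates

start, stop = DateTime(2017, 10, 1), DateTime(2017, 12, 1)

by_day   = start:Day(1):stop     # every midnight from Oct 1 through Dec 1
by_month = start:Month(1):stop   # Oct 1, Nov 1, Dec 1
```

Both ranges share the same endpoints; only the step the user wrote down differs.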
UnitRange(::S, ::T) promotes the arguments to a common type, calling the previous method.
That was the point of my example: if promote(1mm, 2ft) promoted to ft, you'd get a range of length 2; if it promoted to mm, you'd get a range of length 609. Which one is "natural"? Your only defense is not to define promotion, but then that would mean that you can't insert 10mm into a Vector{Meter}, which doesn't make sense either.
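Those two lengths can be reproduced with bare floating-point numbers by doing the unit conversion by hand. The constant name MM_PER_FT is introduced here for illustration; 304.8 mm per foot is the standard definition.

```julia
const MM_PER_FT = 304.8

# Endpoints 1 mm and 2 ft, both expressed in feet, stepping by one foot.
range_in_ft = (1 / MM_PER_FT):2.0

# The same endpoints, both expressed in millimeters, stepping by one millimeter.
range_in_mm = 1.0:(2 * MM_PER_FT)
```

Here length(range_in_ft) is 2 and length(range_in_mm) is 609, so the element count depends on which direction the promotion goes.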
If you allow 1 to be equivalent to 1m then you come to some pretty strange conclusions, like 1s == 1day (because of convert(Dates.Second, convert(Int, Dates.Day(1))), see Dates defines nonsensical conversions · Issue #19896 · JuliaLang/julia · GitHub). I don't think anyone thinks that makes sense. But this isn't an artificial example: you could hit it easily simply by trying to store values in arrays (since setindex! calls convert).
Until your post I didn't fully realize that we currently support a:b for Dates. Yikes. By comparison, in a very well thought-out package for physical Units:
julia> using Unitful: s
julia> 1s:10s
ERROR: DimensionError: s and 1 are not dimensionally compatible.
Stacktrace:
[1] colon(::Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, ::Quantity{Int64, Dimensions:{𝐓}, Units:{s}}) at ./range.jl:9
julia> 1s:1s:10s
1 s:1 s:10 s
The extra effort to specify the range concretely is tiny in comparison to breaking the distributive property for ranges (see my shift example above), and tiny even in comparison to checking the documentation to see what someone has arbitrarily decided that "natural" means.
Let me rephrase it then: if
- a concrete type T
- can take only discrete values
- which can be mapped to a contiguous subset of integers with some affine transformation f (e.g. identity for integers, rata die for Dates),
then let UnitRange(x::T, y::T) denote the set of all possible values between x and y, inclusive. This is what I meant by "natural".
Let's go through the examples:
- DateTime(2017, 10, 1):DateTime(2017, 12, 1) would represent all milliseconds between these two dates.
- promote(1mm, 1ft) is (1//1000 m, 381//1250 m), does not map to <: Integer, UnitRange should throw an error. The user should use StepRange.
- 1s is Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, which has Int64 as the underlying representation. 10s similarly. So 1s:10s is OK, equivalent to [1s,2s,...,10s].
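For Date, the affine-mapping idea can be tried out today by going through Rata Die day numbers explicitly. This is a sketch assuming Julia 1.x; the names f, finv, and all_dates are made up here, while datetime2rata and rata2datetime are the conversion functions from the Dates standard library.

```julia
using Dates

# f maps a Date to its Rata Die day number (an integer); finv maps back.
f(d::Date) = Dates.datetime2rata(d)
finv(n::Integer) = Date(Dates.rata2datetime(n))

# Hypothetical helper: all dates between x and y, inclusive, built from an
# ordinary integer UnitRange.
all_dates(x::Date, y::Date) = finv.(f(x):f(y))

# Generic UnitRange operations such as intersect then come for free.
spell_a = f(Date(2017, 10, 1)):f(Date(2017, 10, 20))
spell_b = f(Date(2017, 10, 15)):f(Date(2017, 11, 5))
overlap = finv.(intersect(spell_a, spell_b))
```

The overlap here runs from October 15 to October 20, computed entirely on the integer representation.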
So then length(DateTime(2017, 10, 1):DateTime(2017, 12, 1)) == 5270400001, right?
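That figure follows from DateTime's millisecond resolution, and can be checked without building any range. This is a small sketch using only the Dates standard library.

```julia
using Dates

# Subtracting two DateTimes gives a Millisecond period.
span = DateTime(2017, 12, 1) - DateTime(2017, 10, 1)

# Count both endpoints, as an inclusive range would.
n_instants = Dates.value(span) + 1
```

October and November together have 61 days, so the span is 61 * 86_400_000 milliseconds.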
Note again that DateTime(2017, 10, 1)..DateTime(2017, 12, 1) (representing an interval from IntervalSets) is a much better way of saying "all times between those two dates", because an interval doesn't imply a step. By not implying something you're not controlling, you sidestep all the concerns I have raised here and I have no objections of any kind. That's why intervals are such a fundamental type.
promote(1mm, 1ft) is (1//1000 m, 381//1250 m), does not map to <: Integer, UnitRange should throw an error. The user should use StepRange.
How do you feel about this:
julia> 1//3 : 4//3
1//3:4//3
1s is Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, which has Int64 as the underlying representation. 10s similarly. So 1s:10s is OK, equivalent to [1s,2s,...,10s]
We have
julia> 1.1:10
1.1:1.0:9.1
and it returns a "StepRange" (really a StepRangeLen), not a UnitRange. Making distinctions based on floating-point vs integer but being "sloppy" about dimensionless and unitful seems backwards to me. If I'm grading a physics test, I'll give full credit to both 3.2m and 32//10 m. I won't give full credit to 3.2.
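The type distinction between those two REPL examples can be checked directly. This sketch assumes Julia 1.x, where the rational endpoints still produce a UnitRange and the floating-point ones a StepRangeLen.

```julia
r_float    = 1.1:10
r_rational = (1//3):(4//3)

float_is_unit_range    = r_float isa AbstractUnitRange
rational_is_unit_range = r_rational isa AbstractUnitRange
```

So a:b already picks different range types depending on the element type, even for dimensionless numbers.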
Yes. Is there a problem with this?
Since Rational{Int64} cannot be mapped to integers using an affine mapping, this should signal an error. I realize that my proposal would break existing code. This should be fixed by making the stepsize explicit.
It is possible that you misunderstand me or I was not clear. I don't think I advocated being sloppy about units. My proposal above only concerns units when arguments with different types are promoted to a common type which maps to integers, that's where the "sloppiness" could come in.
I do agree that IntervalSets is a great way to work around some of the problems. I am perfectly fine with submitting PRs to that package to obtain some behavior (I may also need iterators etc). But I also think that my proposal above is consistent (and of course breaking).
If I understand correctly (please correct me if I am wrong), you would prefer a:b to mean a:d:b, whenever taking d to be 1 makes sense; when in doubt, a:b should not be defined.
OTOH I want a:b to mean all possible values between a and b, inclusive, whenever that makes sense (as I described above); when not, a:b should not be defined.
I think both can be made to work, but not at the same time. The intersection is pretty much a:b defined for integers.
all possible values between a and b, inclusive, whenever that makes sense (as I described above); when not, a:b should not be defined.
That's a great concept (it's an Interval), but Julia has long used colon to create an AbstractRange which means a container of discrete values. It's not so useful to have length(1.0f0:1.0001f0) == 840 simply because there are 840 Float32s between those two numbers. Nor is length(1.0f0..1.0001f0) a particularly useful concept (it depends entirely on the choice of how many mantissa/exponent bits there are, and in general I don't think we want to allow such things to lead to dramatic differences).
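The count of 840 can be reproduced by looking at the bit patterns: for positive Float32 values, adjacent floats have adjacent integer representations. This is a minimal sketch with illustrative variable names.

```julia
lo, hi = 1.0f0, 1.0001f0

# Reinterpret each Float32 as its raw 32-bit pattern; the difference counts the
# steps between them, and adding one includes both endpoints.
n_between = Int(reinterpret(UInt32, hi) - reinterpret(UInt32, lo)) + 1
```

With Float64 endpoints the same calculation would give a vastly larger number, which is the representation dependence mentioned above.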
I am afraid you are ignoring an important part of my proposal: the requirement of an affine mapping to integers (which defines the "natural" stepsize). The above would throw an error under my proposal.
Almost. I want a finite collection of all (equally spaced) values. In a sense, the combination of UnitRange and ClosedInterval. ClosedInterval can be made to work with this, by defining methods for some types. But not, of course, for <: AbstractFloat and similar. If I make PRs for IntervalSets (as I recently did), it would lead to a situation where some methods (e.g. iteration) work for some subtypes of ClosedInterval, but not for others. Would you be OK with this?
In any case, Discourse is now warning me that I am talking to you too much. Thank you for taking the time to discuss, I will keep working on the actual data analysis problem that motivated this whole topic for me (insurance spells, delimited by dates, I need to check spells for overlap, intersections, etc) and will see what API I would need to make that easier.
Goodness, I didn't know it did that. In replying to this I'm getting the same warning. For the record, I don't think 3 messages proposing interesting design ideas about an important topic is too much.
Anyway, thanks for reminding me about your intended limitation on the : operator. We can discuss iteration over elements of ClosedInterval in IntervalSets.