Why are the fields of UnitRange restricted to be <: Real? Using that type could make sense for ranges of other types which are represented by integers under the hood (e.g. Date uses StepRange with Day(1), even though it could use UnitRange). This would be useful since UnitRange has a lot of operations defined generically that StepRange does not (e.g. intersect).
My suspicion is that it's because UnitRange implies/assumes that the step is 1, which only makes sense if the type is real.
We do currently define step(r::AbstractUnitRange) = 1, but we could just as easily define step(r::AbstractUnitRange{T}) = oneunit(T) … seems like an improvement to me.
In olden days, we only had the multiplicative identity one(T), which would have made less sense for step, but now that some dashing programmer has added oneunit, maybe it is time to reconsider.
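To make the one versus oneunit distinction concrete, here is a minimal sketch using the Dates standard library. The helper `unit_step` is a hypothetical name used only for illustration; it is not Base's `step`.

```julia
using Dates

# one(T) is the multiplicative identity, so for a period type it is a plain number.
mult_identity = one(Day)

# oneunit(T) carries the unit, so it is a value of type T.
unit_value = oneunit(Day)

# Hypothetical helper (not part of Base): the step a unit range over
# elements of type T would have if it were defined via oneunit.
unit_step(::Type{T}) where {T} = oneunit(T)

int_step = unit_step(Int)
day_step = unit_step(Day)
```

For integers the two notions coincide, which is why the distinction only shows up for types such as Day.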
If it isn't going to use oneunit it should be renamed to OneRange.
Thanks. I found that for my particular use case,
https://github.com/JuliaMath/IntervalSets.jl
does everything out of the box, so it can substitute for UnitRange.
step(r::AbstractUnitRange{T}) where T = T(1) seems safe. I can no longer remember why I picked 1 rather than T(1).
In contrast, step(r::AbstractUnitRange{T}) = oneunit(T) seems to open the door to 1m:10m. As I've argued elsewhere, I don't think that's a well-defined concept: we all believe that 10m == 1000cm, so why should length(1m:10m) == 10 while length(100cm:1000cm) == 901? (If you want to make a unitful range, physics seems to demand that the user supplies the step as an argument.) The definition step(r::AbstractUnitRange) = 1 is consistent with guarding against such problems.
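The arithmetic behind this objection can be checked with plain integers. The sketch below assumes lengths are stored as integer counts of a chosen unit; no unit package is involved, and the variable names are illustrative.

```julia
# The same physical span, 1 m to 10 m, written in two units.
span_in_m  = 1:10        # endpoints counted in metres
span_in_cm = 100:1000    # the same endpoints counted in centimetres

# The endpoints agree once converted to a common unit...
same_endpoints = (first(span_in_m) * 100 == first(span_in_cm)) &&
                 (last(span_in_m) * 100 == last(span_in_cm))

# ...but an implied step of "one unit" gives different element counts.
n_m  = length(span_in_m)
n_cm = length(span_in_cm)
```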
In contrast, 1m..10m (as defined in the IntervalSets package) is perfectly well-defined, since there is no implied step.
Because it is a "unit range" and you specified the units?
That's just a naming thing. Here we're using unit in the sense of 1, and I know you'd agree that 1m != 1. We could change the name if you think that would help.
This argument seems a bit circular. Yes, I know that's the current behavior. But I like the name UnitRange and wish we adhered to it more literally.
I honestly don't see any problem with saying that a:b is a "unit" range in steps of oneunit for the endpoints (promoted to a common type via promote).
I think that would be useful, strictly generalizes the current behavior, and conforms to most people's expectations of units, e.g. what they would expect for Day(1):Day(10) or 1m:100m if you asked them.
I agree that it seems natural when you are literally typing the characters in the console. But there are many cases where initial impressions lead you down the wrong path. Consider the impact of promotion alone:
julia> UInt8(0x01):Int16(5) # what would 1mm:2ft do?
1:5
julia> a, b = 1, 5
(1, 5)
julia> shift = 0.1
0.1
julia> (a:b) .+ shift == collect(a+shift:b+shift)
true
# Now try this with a, b = 1m, 5m and shift = 1mm.
# (a:b) .+ shift would have length 5
# a+shift:b+shift might have length 4001
I was trying to think of cases in julia where a == x and b == y and yet op(a, b) != op(x, y). (E.g., a and b in meters and x and y in millimeters, and op is UnitRange.) I don't doubt that we can do that (thanks to dispatch), but I think we try to avoid it in general. For example, I think you'd be pretty unhappy if 2*x != 2.0*x. But that's precisely the kind of behavior you're asking for here. In places where we elevate the type above the concept (e.g., 1:5 != collect(1:5)), people tend to get pretty unhappy.
Asking the user to specify the step is not exactly a lot of work. Really, it's the only option given that physical units define an equivalence class (that's their core mathematical property): under such circumstances, there is no such thing as 1, and it would be dangerous to pretend otherwise.
All this seems pretty far from the OP. I agree it doesn't have to be Real, but the example of Dates is precisely the kind of behavior we don't want to enable.
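A sketch of how an explicit step removes the ambiguity. Lengths are held as integer millimetres here, an assumption made only to avoid depending on a unit package.

```julia
# All lengths are integer millimetres.
a, b    = 1000, 5000    # 1 m and 5 m
shift   = 1             # 1 mm
step_1m = 1000          # the step the user actually means: 1 m

# With an explicit step, shifting the range and shifting the endpoints agree.
shifted_range     = (a:step_1m:b) .+ shift
shifted_endpoints = (a + shift):step_1m:(b + shift)

# With an implied step of one unit of the representation (1 mm), they do not.
implied = (a + shift):(b + shift)
```

The explicit-step versions both have 5 elements, while the implied-step range covers every millimetre in between.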
I think that the following would be sensible behavior:
- a:b is parsed as colon(a, b), which dispatches to UnitRange in general (not StepRange like it does now),
- UnitRange(::T, ::T) checks some trait (e.g. has_unit_stepsize(T)); if that is false (the default), it throws an error. It should be true for <: Integer, Date, and similar user-defined types which have a "natural" stepsize.
- UnitRange(::S, ::T) promotes the arguments to a common type, calling the previous method.
So promote(1mm, 2ft) would either be undefined (I am unsure which package the example is for), or the UnitRange constructor would fail because it does not have a unit stepsize. 1m:3m would give something equivalent to [1m,2m,3m].
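The proposal could be sketched roughly as follows. This is not how Base behaves: `has_unit_stepsize` is the hypothetical trait named above, and `natural_step` and `unit_range` are further hypothetical names introduced here so that the sketch does not redefine anything in Base.

```julia
using Dates

# Hypothetical trait from the proposal: does T have a "natural" unit step?
has_unit_stepsize(::Type) = false
has_unit_stepsize(::Type{<:Integer}) = true
has_unit_stepsize(::Type{Date}) = true

# Hypothetical companion: what that step is for each opted-in type.
natural_step(::Type{T}) where {T<:Integer} = oneunit(T)
natural_step(::Type{Date}) = Day(1)

# Same-type endpoints: check the trait, then build an explicit-step range.
function unit_range(a::T, b::T) where {T}
    has_unit_stepsize(T) ||
        throw(ArgumentError("$T has no unit step; give the step explicitly"))
    return a:natural_step(T):b
end

# Mixed-type endpoints: promote first, then call the method above.
unit_range(a, b) = unit_range(promote(a, b)...)
```

Under these assumptions, unit_range(1, 5) and unit_range(Date(2017, 1, 1), Date(2017, 1, 5)) succeed, while unit_range(1.0, 2.0) throws because Float64 has not opted in to the trait.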
I'm not saying we couldn't define those methods, I'm saying we shouldn't. The word "natural" is really scary: if I use a start of DateTime(2017, 10, 1) and a stop of DateTime(2017, 12, 1), what's natural? A step of a day or a month? (Both those dates are the first days of their respective months.) Remember that a number with physical units corresponds to some external reality independent of how you choose to describe that reality. If I tell you that I marked out a playing field by drawing lines of spacing 1 between my mailbox and my ditch, even if you know my yard you have no clue how many lines I drew. In contrast, if I say "1 fathom" then you know, and if you'd personally rather calculate in meters you can convert everything and come to the same answer I would while working in fathoms. You get the same answer independent of representation: that's the entire point of physical units.
UnitRange(::S, ::T) promotes the arguments to a common type, calling the previous method.
That was the point of my example: if promote(1mm, 2ft) promoted to ft, you'd get a range of length 2; if it promoted to mm, you'd get a range of length 609. Which one is "natural"? Your only defense is not to define promotion, but then that would mean that you can't insert 10mm into a Vector{Meter}, which doesn't make sense either.
If you allow 1 to be equivalent to 1m then you come to some pretty strange conclusions, like 1s == 1day (because of convert(Dates.Second, convert(Int, Dates.Day(1))), see Dates defines nonsensical conversions · Issue #19896 · JuliaLang/julia · GitHub). I don't think anyone thinks that makes sense. But this isn't an artificial example: you could hit it easily simply by trying to store values in arrays (since setindex! calls convert).
Until your post I didn't fully realize that we currently support a:b for Dates. Yikes. By comparison, in a very well thought-out package for physical units:
julia> using Unitful: s
julia> 1s:10s
ERROR: DimensionError: s and 1 are not dimensionally compatible.
Stacktrace:
[1] colon(::Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, ::Quantity{Int64, Dimensions:{𝐓}, Units:{s}}) at ./range.jl:9
julia> 1s:1s:10s
1 s:1 s:10 s
The extra effort to specify the range concretely is tiny in comparison to breaking the distributive property for ranges (see my shift example above), and tiny even in comparison to checking the documentation to see what someone has arbitrarily decided that "natural" means.
Let me rephrase it then: if a concrete type T
- can take only discrete values
- which can be mapped to a contiguous subset of integers with some affine transformation f (e.g. identity for integers, rata die for Dates),
then let UnitRange(x::T,y::T) denote the set of all possible values between x and y, inclusive. This is what I meant by "natural".
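A minimal sketch of the affine-mapping idea for Date, assuming the mapping is the integer day count returned by Dates.value. The helpers `to_int` and `from_int` and the reference date are hypothetical names chosen for illustration.

```julia
using Dates

# Affine map from Date to integers: Dates.value gives the day count.
to_int(d::Date) = Dates.value(d)

# Inverse map, written relative to a reference date so it does not
# depend on which day number the epoch is assigned.
ref = Date(2000, 1, 1)
from_int(n::Integer) = ref + Day(n - to_int(ref))

# Two spells, each stored as a UnitRange over day numbers.
spell_a = to_int(Date(2017, 1, 1)):to_int(Date(2017, 3, 1))
spell_b = to_int(Date(2017, 2, 1)):to_int(Date(2017, 4, 1))

# Generic UnitRange operations such as intersect now apply directly.
overlap = intersect(spell_a, spell_b)
overlap_dates = (from_int(first(overlap)), from_int(last(overlap)))
```

Working on the integer side keeps the step unambiguous (one day), and the dates are recovered only at the boundary.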
Let's go through the examples:
- DateTime(2017, 10, 1):DateTime(2017, 12, 1) would represent all nanoseconds between these two dates.
- promote(1mm, 1ft) is (1//1000 m, 381//1250 m), does not map to <: Integer, UnitRange should throw an error. The user should use StepRange.
- 1s is Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, which has Int64 as the underlying representation. 10s similarly. So 1s:10s is OK, equivalent to [1s,2s,...,10s].
So then length(DateTime(2017, 10, 1):DateTime(2017, 12, 1)) == 5270400001, right?
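That figure can be reproduced without constructing a range, since DateTime in the Dates standard library is stored with millisecond resolution. A small sketch (variable names are illustrative):

```julia
using Dates

start, stop = DateTime(2017, 10, 1), DateTime(2017, 12, 1)

# DateTime differences are returned as Millisecond periods.
elapsed = stop - start

# Number of representable DateTime values from start to stop, inclusive.
n_values = Dates.value(elapsed) + 1
```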
Note again that DateTime(2017, 10, 1)..DateTime(2017, 12, 1) (representing an interval from IntervalSets) is a much better way of saying "all times between those two dates", because an interval doesn't imply a step. By not implying something you're not controlling, you sidestep all the concerns I have raised here and I have no objections of any kind. That's why intervals are such a fundamental type.
promote(1mm, 1ft) is (1//1000 m, 381//1250 m), does not map to <: Integer, UnitRange should throw an error. The user should use StepRange.
How do you feel about this:
julia> 1//3 : 4//3
1//3:4//3
1s is Quantity{Int64, Dimensions:{𝐓}, Units:{s}}, which has Int64 as the underlying representation. 10s similarly. So 1s:10s is OK, equivalent to [1s,2s,...,10s]
We have
julia> 1.1:10
1.1:1.0:9.1
and it returns a "StepRange" (really a StepRangeLen), not a UnitRange. Making distinctions based on floating-point vs integer but being "sloppy" about dimensionless and unitful seems backwards to me. If I'm grading a physics test, I'll give full credit to both 3.2m and 32//10 m. I won't give full credit to 3.2.
Yes. Is there a problem with this?
Since Rational{Int64} cannot be mapped to integers using an affine mapping, this should signal an error. I realize that my proposal would break existing code. This should be fixed by making the stepsize explicit.
It is possible that you misunderstand me or I was not clear. I don't think I advocated being sloppy about units. My proposal above only concerns units when arguments with different types are promoted to a common type which maps to integers, that's where the "sloppiness" could come in.
I do agree that IntervalSets is a great way to work around some of the problems. I am perfectly fine with submitting PRs to that package to obtain some behavior (I may also need iterators etc). But I also think that my proposal above is consistent (and of course breaking).
If I understand correctly (please correct me if I am wrong), you would prefer a:b to mean
a:d:b, whenever d as 1 makes sense; when in doubt, a:b should not be defined.
OTOH I want a:b to mean
all possible values between a and b, inclusive, whenever that makes sense (as I described above); when not, a:b should not be defined.
I think both can be made to work, but not at the same time. The intersection is pretty much a:b defined for integers.
all possible values between a and b, inclusive, whenever that makes sense (as I described above); when not, a:b should not be defined.
That's a great concept (it's an Interval), but Julia has long used colon to create an AbstractRange which means a container of discrete values. It's not so useful to have length(1.0f0:1.0001f0) == 840 simply because there are 840 Float32s between those two numbers. Nor is length(1.0f0..1.0001f0) a particularly useful concept (it depends entirely on the choice of how many mantissa/exponent bits there are, and in general I don't think we want to allow such things to lead to dramatic differences).
I am afraid you are ignoring an important part of my proposal: the requirement of an affine mapping to integers (which defines the "natural" stepsize). The above would throw an error under my proposal.
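The count of 840 can be checked from the bit patterns. This sketch relies on the fact that positive, finite IEEE floating-point values are ordered the same way as their bit patterns; `count_by_stepping` is a hypothetical helper added as a cross-check.

```julia
# Count representable Float32 values from lo to hi, inclusive,
# as a difference of bit patterns.
lo, hi = 1.0f0, 1.0001f0
n_floats = Int(reinterpret(UInt32, hi) - reinterpret(UInt32, lo)) + 1

# Cross-check by walking from lo to hi one representable value at a time.
function count_by_stepping(lo::Float32, hi::Float32)
    n, x = 1, lo
    while x < hi
        x = nextfloat(x)
        n += 1
    end
    return n
end
```

The same endpoints taken as Float64 would give a vastly larger count, which is the dependence on representation being objected to.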
Almost. I want a finite collection of all (equally spaced) values. In a sense, the combination of UnitRange and ClosedInterval. ClosedInterval can be made to work with this, by defining methods for some types. But not, of course, for <: AbstractFloat and similar. If I make PRs for IntervalSets (as I recently did), it would lead to a situation where some methods (eg iteration) work for some subtypes for ClosedInterval, but not for others. Would you be OK with this?
In any case, Discourse is now warning me that I am talking to you too much.
Thank you for taking the time to discuss, I will keep working on the actual data analysis problem that motivated this whole topic for me (insurance spells, delimited by dates, I need to check spells for overlap, intersections, etc) and will see what API I would need to make that easier.
Goodness, I didn't know it did that. In replying to this I'm getting the same warning. For the record, I don't think 3 messages proposing interesting design ideas about an important topic is too much.
Anyway, thanks for reminding me about your intended limitation on :. We can discuss iteration over elements of ClosedInterval in IntervalSets.