Why does scientific notation break the range function?

I think you’re arguing here for a “do what I mean” behavior. Floating point “integers” like 1e10 usually don’t match their closest integers perfectly; there’s rounding error involved. So treating something like 9.9999999 as 10 just because it’s very close to 10 invites a lot of bugs and weird behavior. For example, how close should a floating point “integer” be to its nearest integer before it is allowed to be used as one? I personally like the decision to just squash that whole class of bugs by requiring integers. If you want the language to “do what I mean” in one simple situation like 1e10, you often have to deal with bad edge cases in other circumstances that are not immediately obvious.
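To make the "how close is close enough" point concrete, here is a minimal sketch of how Julia currently treats the two cases. The variable names are only illustrative, and the value 9.9999999 is the one from the paragraph above.

```julia
# 1e10 is a Float64 literal that happens to hold an integer value exactly.
x = 1e10
isinteger(x)          # true
n = Int(x)            # exact conversion, no rounding involved

# A value that is merely close to an integer is rejected, not silently rounded.
y = 9.9999999
isinteger(y)          # false
err = try
    Int(y)
catch e
    e
end
err isa InexactError  # true

# Rounding has to be requested explicitly.
m = round(Int, y)     # 10
```

So the conversion itself is already strict; the open question in this thread is only whether functions like range should call it for you.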

14 Likes

In general, Julia is actually very accommodating when it comes to accepting varying input types. Much more so than Matlab. When calling built-in Matlab functions, you will often see that only double is accepted (perhaps one or two more built-in types), because it calls into compiled C++ libraries.

Julia functions are very often highly generic, and accept any type that “quacks like a duck”.

What happens if you try

linspace(int64(0), int64(1), 10)

in Matlab? This works in Julia with all sorts of integers (after fixing the names).

In Matlab things are built around double, and almost everything is a double. Stray from that and you soon get in trouble.

2 Likes

The error thrown by Julia is actually pretty good in this case, and almost the same as the one Python gives:

julia> range(0, 1, length=1e2)
ERROR: TypeError: in keyword argument length, expected Union{Nothing, Integer}, got a value of type Float64
Stacktrace:
 [1] top-level scope
   @ REPL[1]:1

vs

In [1]: import numpy
In [2]: numpy.linspace(0, 1, 1e2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-5b8b1326f642> in <module>
----> 1 numpy.linspace(0, 1, 1e2)
<__array_function__ internals> in linspace(*args, **kwargs)
/usr/local/lib/python3.9/site-packages/numpy/core/function_base.py in linspace(start, stop, num, endpoint, retstep, dtype, axis)
    111 
    112     """
--> 113     num = operator.index(num)
    114     if num < 0:
    115         raise ValueError("Number of samples, %s, must be non-negative." % num)
TypeError: 'float' object cannot be interpreted as an integer

Allowing floating point values would open a whole other can of worms. For example, 1e10 == 10^10, but 1e20 != 10^20.

14 Likes

Unfortunate example since the left hand side can be represented exactly but the right hand side overflows.

julia> 1e20 == BigInt(10)^20
true

julia> 10^20
7766279631452241920
4 Likes
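For anyone following along, the overflow can be seen side by side with the wider integer types. This is only a small sketch; the variable names are mine.

```julia
# Int64 arithmetic wraps around silently, so 10^20 is not the mathematical value.
wrapped = 10^20                 # 7766279631452241920
typemax(Int64)                  # 9223372036854775807, smaller than 10^20

# Wider integer types hold the true value.
exact_big = big(10)^20
exact_128 = Int128(10)^20

# The Float64 literal 1e20 is exactly the true value, so the float side was fine.
1e20 == exact_big               # true
```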

Indeed, you are right! A better example would probably be Int(1e16 + 1) == Int(1e16) which would make range(... length=1e16 + 1) very confusing.
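A quick sketch of why that would be confusing: above 2^53 the spacing between adjacent Float64 values is larger than 1, so the +1 disappears before range would ever see it.

```julia
a = 1e16
b = 1e16 + 1           # the +1 is lost to rounding
a == b                 # true
Int(a) == Int(b)       # true
eps(1e16)              # 2.0, the gap to the next Float64

# With integer arithmetic the two lengths stay distinct.
10^16 + 1 == 10^16     # false
```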

1 Like

7 posts were split to a new topic: Frustrated with frustrated beginners

I’m wrong about the syntax being free: currently 1E6 === 1e6. So this would be a breaking change for a small gain. Ignore.

3 Likes

Good catch. That’ll be fixed by

9 Likes

Actually, they “usually” do match: Float64 floating-point numbers represent all integers exactly up to maxintfloat(Float64) = 2^53, about 10^16.

Moreover, because 10 contains a factor of 2, binary floating point represents many integers $n \times 10^m = (n \times 5^m) \times 2^m$ exactly up to much bigger values, up to about 1e22. Even in single precision (Float32), which only represents all integers exactly up to maxintfloat(Float32) = 2^24 ≈ 1.7e7, you can represent “scientific-notation” numbers 1fN exactly up to 1f10. It only goes wrong at 1e23 == 99999999999999991611392 (in Float64) or 1f11 == 99999997952 (in Float32).

In short, Float64 literals are exact for essentially any practical array length.
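These limits can be checked directly. The sketch below uses two hypothetical helpers, is_exact and last_exact, which are not part of Base; they parse the literal 1e<k> as the given type and compare it with 10^k computed in BigInt.

```julia
# Hypothetical helper: does the literal 1e<k>, parsed as type T, equal 10^k exactly?
# Converting Float32 to Float64 is exact, so going through Float64 loses nothing.
is_exact(T, k) = BigInt(Float64(parse(T, "1e$k"))) == big(10)^k

# Hypothetical helper: largest k in 0:30 for which the literal is still exact.
last_exact(T) = maximum(k for k in 0:30 if is_exact(T, k))

last_exact(Float64)              # 22
last_exact(Float32)              # 10
BigInt(1e23)                     # 99999999999999991611392

# All integers (not just powers of ten) are exact only up to maxintfloat.
maxintfloat(Float64) == 2^53     # true
maxintfloat(Float32) == 2^24     # true
```

The cutoffs follow from the factorization above: 5^22 still fits in the 53-bit significand of Float64 and 5^10 in the 24 bits of Float32, while 5^23 and 5^11 do not.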

The main concerns in the discussions that I linked above were different. For example, if you accept floating-point lengths in a few functions for convenience, which functions? It would be oddly inconsistent for them to work in only some functions (e.g. ranges, zeros, …) but not others, but then you have an ever-expanding set of conversion methods that you need to remember to write.

17 Likes

I think we should make an exception here, for pragmatic reasons.

My argument is this: although, as you write, floats represent “many integers” exactly but not all of them, and a length is an integer (counted, never measured), I realized that no float passed as a length would actually be problematic:

julia> range(0, 10, length=1000)  # returns same as range(0e0, 10e0, length=1000)
0.0:0.01001001001001001:10.0

That looked awfully binary-like to me at first. :slight_smile: I realized that, despite the integer count, it must be converted to a Float64 because of a division somewhere, so why not allow both operands to be Float64? You can construct the same range with all three numbers as floats, or, as here, with the first and last numbers given as integers:

julia> 0:0.01001001001001001:10
0.0:0.01001001001001001:10.0

Since ranges are hugely important, and even the special colon syntax above allows mixing types, I think we should make the exception in this case (partly for consistency with that syntax).

I was thinking we would need to check for extreme float values and disallow them, just as we need to for negative “lengths” (so a run-time check is unlikely to be objectionable for performance reasons):

julia> range(0e0,10e0,length=-1000)
ERROR: ArgumentError: range(0.0, stop=10.0, length=-1000): negative length

Right, although technically you’re not constructing an array until you call e.g. collect(range(0e0,10e0,length=10)) or otherwise use the range with an array.

Jeff wrote the following on GitHub (regarding shortening stack traces), so I doubt it will happen in general (though Karpinski makes some exception for code that Julia shows but Python wouldn’t, since there it is implemented in C):

This has come up a couple times in the past and I generally resist hiding information.

1 Like

Rather than upend the entire language, you can

myrange(start,stop,len) = range(start,stop,length=Int(len))
julia> myrange(0,10,1e3)
0.0:0.01001001001001001:10.0
2 Likes

I would say it is overstating things to say this upends “the entire language”, but that’s not going to be helpful to beginners… :slight_smile: And I was thinking extreme values need checking (or not?), so this might be buggy code, but it is a step in the right direction for Base.
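If someone does want such a helper in their own code, the checks mentioned above could look roughly like this. It is only a sketch: myrange is the hypothetical user-level wrapper from the earlier post, not anything in Base, and the decision to reject floats above maxintfloat is my assumption about what "extreme values" should mean.

```julia
# Integer lengths pass straight through to Base's range.
myrange(start, stop, len::Integer) = range(start, stop, length=len)

# Float lengths are accepted only when they hold an exactly representable count.
function myrange(start, stop, len::AbstractFloat)
    isinteger(len) ||
        throw(ArgumentError("length must be integer-valued, got $len"))
    abs(len) <= maxintfloat(typeof(len)) ||
        throw(ArgumentError("length $len is too large to be a reliable count"))
    return range(start, stop, length=Int(len))
end

r = myrange(0, 10, 1e3)          # same as range(0, 10, length=1000)

# Non-integer and extreme values are rejected before range is called.
bad_fraction = try
    myrange(0, 10, 1e3 + 0.5)
catch e
    e
end
bad_huge = try
    myrange(0, 10, 1e17)
catch e
    e
end
```

NaN and Inf are also rejected, since isinteger returns false for both, and negative lengths still hit the existing ArgumentError inside range.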

1 Like

Curious discussion here. Poor OP! As teachers we learn to deal with frustrating learning curves, and with the fact that the questions are almost always the same; it is the people that change :slight_smile: .

My answer to this would be, simply:

Julia makes a clear difference between integers and floats, so the practical syntax for that is

range(1,10,length=10^3)

Because literal powers of integers are integers, we can also write

for i in 1:10^3
end

I don’t know if Matlab accepts 1e3 there, for example.

Note that this is another difference from other languages, perhaps including Matlab, so be careful with

julia> div(10^2,10^3)
0
10 Likes
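To spell out the integer-division caveat, here is a short sketch comparing the operators on the same integer operands. The variable names are only for illustration.

```julia
# div (also written ÷) truncates toward zero and stays in the integers.
q1 = div(10^2, 10^3)    # 0
q2 = 10^2 ÷ 10^3        # 0

# The / operator always produces a float, even for integer operands.
q3 = 10^2 / 10^3        # 0.1

# // builds an exact rational instead.
q4 = 10^2 // 10^3       # 1//10
```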

Please don’t. Being consistent with Ints and not allowing floats is going to be a lot less confusing in the long run. Making the language optimally welcoming for the first week of use, and annoying for the next 20+ years is not the right tradeoff, imho.

13 Likes

It’s actually kind of awesome to read how the different members of the forum think about these problems. Some of it is totally over my head, but an interesting discussion, nonetheless!

11 Likes

I would rather point out that Matlab accepts those inputs because it does not really know anything else but a double. :wink:

Seriously, being able to catch type errors is valuable. The argument is optional, which means it could be nothing. (With me so far?) Hence, a union of Nothing and Integer.
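A minimal sketch of that pattern, using a made-up function npoints that only mimics the shape of the length keyword and is not how range itself is implemented:

```julia
# Hypothetical function: an optional keyword that is either absent or an integer.
function npoints(; length::Union{Nothing,Integer}=nothing)
    length === nothing && return "length not given"
    return "length = $length"
end

npoints()              # "length not given"
npoints(length=100)    # "length = 100"

# A Float64 is neither Nothing nor an Integer, so the call fails with a TypeError.
err = try
    npoints(length=1e2)
catch e
    e
end
err isa TypeError      # true
```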

5 Likes

I think this is true for older versions of Matlab. Nowadays there are, e.g., 16-bit integer arrays. A value like 1000 probably still becomes a 1x1 double matrix by default, but you could force it to be a 1x1 int16 matrix.

Integer types have existed in Matlab for quite long but literals (e.g. 1000) are doubles, so it would be unreasonably inconvenient to require integer types to be used as arguments to common functions, not to speak of backwards compatibility issues with changing it.

4 Likes

For what it’s worth, LinRange gives a better stacktrace (and is a cleaner drop-in replacement for linspace):

julia> LinRange(0, 1, 1e3)
ERROR: MethodError: no method matching LinRange(::Int64, ::Int64, ::Float64)
Closest candidates are:
  LinRange(::Any, ::Any, ::Integer) at range.jl:563
Stacktrace:
 [1] top-level scope
   @ REPL[14]:1

julia> range(0, 1, length=1e3)
ERROR: TypeError: in keyword argument length, expected Union{Nothing, Integer}, got a value of type Float64
Stacktrace:
 [1] top-level scope
   @ REPL[15]:1
1 Like

The Julia community is very particular in this sense: any post here gets answers from newbies, relatively new users, computer engineers working at Julia Computing, language founders, and world-class computer scientists, all together. Many new questions bring back heated discussions of the past, but even then it is curious how often we can find and link the GitHub issue where the decision was taken, sometimes, as here, many years ago.

At the same time, nobody knows the background of the poster. He or she may be someone with a strong background in computer science, or someone who is just starting to script something. More than once I have answered a question and later realized that the poster had a much deeper knowledge of the subject than I do. Having to read my answer was probably irritating. The same thing happens the other way around: sometimes people get answers here that are far removed from their background. The good thing is that, with some persistence, the amount we learn about computers and programming is huge.

24 Likes