Discussion about integer overflow

viraltux · October 11, 2021, 12:44pm

Yes, it is a limitation but I would not call it big since we could use BigInt if Int64 is not enough and an overflow is something we can catch.

I find more worrying this:

> julia> 1 / 10^20000
> Inf

I’d rather have an overflow here than Inf, or at least an Infinitesimal value… Maybe something we could ask for future versions of Julia.

johnmyleswhite · October 11, 2021, 12:49pm

> julia> 1 / 10^20000
> Inf

This example seems messy since it’s conflating integer overflow and FP overflow:

10^20_000 evaluates to exactly 0.
So the expression is really 1 / 0, where Inf seems like the right answer.

Of course, one can say that you’d use BigInt everywhere, but there’s a major performance tradeoff then.
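A minimal sketch of the distinction drawn above, assuming 64-bit integers (written as Int64 explicitly); the variable names are illustrative only. It separates the integer wraparound from the floating-point overflow and shows the BigInt alternative:

```julia
# Int64 arithmetic wraps modulo 2^64. Since 10^64 = 2^64 * 5^64, every
# Int64 power of 10 with an exponent of 64 or more wraps to exactly zero.
int_pow = Int64(10)^20_000
println(int_pow)            # 0

# `/` converts integer arguments to Float64, so this is 1.0 / 0.0.
println(1 / int_pow)        # Inf

# Floating point overflows to Inf instead of wrapping.
float_pow = 10.0^20_000
println(float_pow)          # Inf
println(1 / float_pow)      # 0.0

# BigInt keeps every digit, at a cost in speed and memory.
big_pow = big(10)^20_000
println(ndigits(big_pow))   # 20001
```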

viraltux · October 11, 2021, 12:51pm

Well, then the problem is that 10^20_000 should evaluate to Inf. I think the Julia language should have a review of these overflow scenarios.

GunnarFarneback · October 11, 2021, 1:24pm

Integer overflow has been thought about since at least 2012, is documented in the FAQ, has been discussed repeatedly, has been implemented in a package, and finally Julia is not at that stage of development anymore.

viraltux · October 11, 2021, 1:56pm

Very thorough answer! Thank you very much for the links @GunnarFarneback

I will, however, respectfully disagree with this design decision. I would say people doing analytics would much rather have Julia throw an overflow error than get incorrect results in their calculations.

StefanKarpinski · October 11, 2021, 2:19pm

How do those people feel about for loops being slow? Why is that relevant, you ask? Because for loops are implemented using integer arithmetic. You slow down integer arithmetic and you slow down the entire language.

viraltux · October 11, 2021, 2:45pm

I would say those people might expect that in a high level language intended for numerical analysis correct results are above performance and they might expect not to have to worry about C++ like considerations when handling numbers. I would also say that, if they want performance, they would also like to have a package like FastIntegers.jl (just an example) knowing the kind of problems they might run into if they use it.

However, if I understood you correctly, it seems that integer arithmetic is so intertwined with Julia that any safeguard would cause a general slowdown. I have a question, if I may: would just throwing an error also slow down the entire language?

I am asking because there might be strong regulatory implications; people’s lives might depend on the results of some analytics.

Oscar_Smith · October 11, 2021, 2:50pm

Yes. Throwing an error is a massive slowdown. To do so, you have to check for overflow on every operation, and the possibility of throwing errors means that the compiler can’t optimize your code, since things like vectorization rely on arithmetic not having side effects.

Also, note that Julia has a library, SaferIntegers.jl, which you can use if you want errors on overflow.
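As a rough illustration of what "checking every operation" means, here is a sketch using the checked arithmetic functions in Base.Checked rather than SaferIntegers.jl itself (a stand-in chosen so the snippet needs no package). The variable names are illustrative:

```julia
# Default Int64 addition wraps silently.
wrapped = typemax(Int64) + Int64(1)
println(wrapped == typemin(Int64))   # true

# add_with_overflow returns the wrapped result plus an overflow flag;
# inspecting this flag is the extra per-operation work a checked mode needs.
result, overflowed = Base.Checked.add_with_overflow(typemax(Int64), Int64(1))
println(overflowed)                  # true

# checked_add turns the flag into an exception.
threw = try
    Base.Checked.checked_add(typemax(Int64), Int64(1))
    false
catch err
    err isa OverflowError
end
println(threw)                       # true
```

The branch and the possible exception on every addition are what get in the way of optimizations such as vectorization.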

gbaraldi · October 11, 2021, 2:53pm

There is also https://github.com/milankl/Sherlogs.jl, which is nice for checking for some floating-point errors, especially with Float32/Float16.

viraltux · October 11, 2021, 2:54pm

Thank you, Oscar, for confirming this point. Okay then, it is the way it is, and certainly some regulatory bodies might require SaferIntegers.jl to accept calculations, which is something good to keep in mind.

Thank you all for your help, I’ve learned something today!

StefanKarpinski · October 11, 2021, 3:22pm

At this point I’ve been involved in a lot of discussions with customers using Julia in a variety of heavily regulated industries: finance, pharma, medicine, insurance, aviation, aerospace, etc. While they have many concerns, I can’t say that anyone in any of those industries has ever expressed concern about integer overflow as a regulatory issue. If it were a big issue, it would rule out the use of many languages in those industries, including Fortran, C, C++, Java and C#. Moreover, since every fast language has integer overflow, even if you use Python, most of your computation is still happening in a fast language that has integers that overflow. So while I agree that in an ideal world we wouldn’t have to make this tradeoff, we live in a world where checking every integer operation is prohibitively costly, and if we want Julia to be a language in which you can write code that’s as fast as possible (which we do), this is the call we’re forced to make.

lungben · October 11, 2021, 5:20pm

Regarding Python: the standard int type does not overflow and is thus comparable to BigInt in Julia. But it is also very slow.
If you want to speed up Python using NumPy, Numba or Cython, machine-type integers are used, which do overflow without warning.
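The same contrast can be sketched on the Julia side, assuming 64-bit integers. Int64 plays the role of the machine-type integer and BigInt the role of Python's built-in int. The factorial lines are included only as an example of a Base function that does check its range:

```julia
# Int64 wraps: 2^64 does not fit, so the result is 0.
println(Int64(2)^64)                  # 0

# BigInt grows as needed, like Python's built-in int.
println(big(2)^64)                    # 18446744073709551616

# factorial refuses Int64 inputs whose result would not fit,
# instead of returning a wrapped value.
too_big = try
    factorial(Int64(21))
    false
catch err
    err isa OverflowError
end
println(too_big)                      # true
println(factorial(big(21)))           # 51090942171709440000
```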

czylabsonasa · October 11, 2021, 6:19pm

1/10^20000 = 1/0 should be Inf; it is about integer arithmetic (10^20000 = 0).
Perhaps you meant 1/10.0^20000 = 1/Inf = 0.0.

viraltux · October 12, 2021, 8:39am

Well, now you have found the first one. I work for Big Pharma, and in some of my past projects I worked closely with the pharmacokinetics and pharmacodynamics crowd. These people need to have their models approved by regulatory bodies. How do you think they will react when they find out that Julia, by design, accepts 1 + 1 = 2.12?

julia> 1/(1-10^49/10^63) + 1/(1-10^49/10^63) # ~1 + ~1
2.123792857605807

R>     1/(1-10^49/10^63) + 1/(1-10^49/10^63) # ~1 + ~1
[1] 2

They might ask the following question, and rightly so: “Wait a minute, you are telling me that my model can be mathematically correct but that I cannot expect mathematically correct results?” If we now mention Julia’s speed, the only thing they are going to hear is how fast Julia fails.

I am using R language as an example because R is widely utilized in Pharma when regulations matter and this is one language they use to have their models approved.

Regulated industries are not regulated in every step of their way, for instance, I have also been involved in Biomedical Imaging projects, in those projects we could use anything we want, any language, any hardware, anything, and that’s because when it comes to research we don’t have any technological regulatory constraints.

However, if you need your research to be approved, that’s a different world; for instance, even though R is accepted by regulatory bodies in Pharma, not all of R is accepted: only certain versions and certain packages are.

There is not one single language intended for numerical analysis that I know of (SAS, R, S, Matlab, Mathematica, Maxima, Octave, SPSS,… a few others), not one, that allows for incorrect arithmetic of the kind I showed you above. Not one except Julia now.

Obviously, these languages might all be using C++ or Fortran under the hood, but they all still guarantee that a correct formula returns correct results.

Stefan, I believe that when we try hard often we can find ways to have the best of both worlds…

For instance, how about implementing a flag for a safe Julia (e.g. julia --safe)? This way people worried about regulations could develop their models in the standard fast julia, but have them approved by running the very same models in julia --safe mode.

This would make Julia not only the fastest language, but the safest too since, in safe mode, all kinds of fancy safety measures could be put in place with no concerns for speed… Just an idea.

goerch · October 12, 2021, 9:06am

Regarding floating point arithmetic I’ve found LLVM Language Reference Manual — LLVM 16.0.0git documentation. Are these supported by Julia?

Regarding checked integer arithmetic clang/llvm seem to recommend to use sanitizers: UndefinedBehaviorSanitizer — Clang 16.0.0git documentation. There is no equivalent for Julia AFAIU?

rafael.guerra · October 12, 2021, 9:22am

Could you write it as exp10(49)/exp10(63) ?

viraltux · October 12, 2021, 9:52am

I could, but how would you know if the developers of the package you are using did?

Sukera · October 12, 2021, 9:52am

You’re calculating numbers way outside the domain of integers, so why not go with floating point numbers in the first place?

julia> 1/(1-10^49.0/10^63.0) + 1/(1-10^49.0/10^63.0)
2.00000000000002

This also shows that R is lying to you - ~1 + ~1 is only ~2 after all and not exactly 2.

Interestingly, I believe the majority of languages you’ve quoted do actually have the same “problem” - they just mask it by showing fewer decimals than would be required to accurately represent the true number. For example, both Matlab and Mathematica lie to you in the same way. See here for Matlab and here for Mathematica:

> By default, the inputs 0.1 and 0.2 in the example are taken to have MachinePrecision. At a common MachinePrecision of 15.9546 digits, 0.1 + 0.2 actually has a FullForm of 0.30000000000000004, but is printed as 0.3.

Arguably, your example is hitting a different failure mode (integer overflow) than just floating point imprecision, so I guess that’s a point to be made? For that though, SaferIntegers.jl has been suggested and would show you the problem right away.
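A sketch of how a checked power would surface the problem, assuming 64-bit integers. `checked_pow` is a hypothetical helper written for this illustration. It is not part of Base or of SaferIntegers.jl and only stands in for the kind of check such a package performs:

```julia
# Hypothetical helper: integer power by repeated multiplication that throws
# OverflowError instead of wrapping.
function checked_pow(base::Int64, exponent::Integer)
    exponent >= 0 || throw(DomainError(exponent, "exponent must be non-negative"))
    result = one(Int64)
    for _ in 1:exponent
        result = Base.Checked.checked_mul(result, base)
    end
    return result
end

println(checked_pow(Int64(10), 18))   # 1000000000000000000, fits in Int64

# 10^49 needs about 163 bits, so the check fires.
overflowed = try
    checked_pow(Int64(10), 49)
    false
catch err
    err isa OverflowError
end
println(overflowed)                   # true

# The unchecked version silently returns a wrapped value.
println(Int64(10)^49 == big(10)^49)   # false
```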

cjdoris · October 12, 2021, 10:02am

Unfortunately your proposed safe mode would actually change the semantics of Julia - the overflow behaviour of Int64 and friends is part of the language.

This is all documented. Ints should only be used for “small” integers (such as for counting things). For large or continuous quantities there are plenty of alternatives: floats; big ints; checked integers.

In your example, replace 10 with 10.0 to get

julia> 1/(1-10.0^49/10.0^63) + 1/(1-10.0^49/10.0^63)
2.00000000000002

which is actually more correct than what R prints.
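One way to sanity-check the claim is to compare against exact rational arithmetic on BigInt, which neither overflows nor rounds. This is a sketch; the variable names and the 1e-14 tolerance are choices made for the illustration:

```julia
# Exact rational arithmetic with BigInt.
ratio = big(10)^49 // big(10)^63          # reduces to 1 // 10^14
exact = 1 // (1 - ratio) + 1 // (1 - ratio)
println(exact)                            # 200000000000000//99999999999999

# Float64 version, using 10.0 instead of 10.
approx = 1 / (1 - 10.0^49 / 10.0^63) + 1 / (1 - 10.0^49 / 10.0^63)

# exp10 is another way to stay in floating point.
approx_exp10 = 1 / (1 - exp10(49) / exp10(63)) + 1 / (1 - exp10(49) / exp10(63))

println(abs(approx - Float64(exact)) < 1e-14)        # true
println(abs(approx_exp10 - Float64(exact)) < 1e-14)  # true
```

The exact value is slightly above 2, so a printed result of exactly 2 hides the last digits.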

GunnarFarneback · October 12, 2021, 11:13am

How well do you know those languages?

octave:2> x
x = 128
octave:3> x + x
ans = 255
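For comparison, a sketch of the same addition in Julia. It assumes that x in the Octave session is an 8-bit unsigned integer (its definition is not shown above), in which case Octave's result reflects saturating arithmetic. `saturating_add` is a hypothetical helper written for this illustration:

```julia
# Assumption: x in the Octave session is a uint8; its definition is not shown.
x = UInt8(128)

# Julia's fixed-width integers wrap modulo 2^8.
println(Int(x + x))                      # 0

# Hypothetical helper mimicking saturation: clamp the widened sum
# to the UInt8 range before converting back.
saturating_add(a::UInt8, b::UInt8) = UInt8(min(Int(a) + Int(b), Int(typemax(UInt8))))
println(Int(saturating_add(x, x)))       # 255
```

Neither result equals the mathematical sum of 256; the two languages just pick different wrong answers.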
