Discussion about integer overflow

StefanKarpinski · October 12, 2021, 3:05pm

But correctness of those results is in the eye of the beholder. What if the next thing you do with that value is ask whether it’s odd or not?

octave:9> rem(3^40, 2) == 1
ans = 0

julia> isodd(3^40)
true

The Octave answer is completely wrong whereas the Julia answer is correct (and would be for any integer arguments). Perhaps you work in an area where only the general magnitude of a result is important, in which case, yes, 1.2158e+19 is a better answer. And if that is the case for you then you should use floats and you’ll get as good answers as you would in languages that do everything with floats. But that’s not universally the case. There are people for whom the last bits of an integer computation are just as important as the first bits (they should use BigInts); there are also many situations where only the last bits matter (Ints are great for this).
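A minimal sketch of the three choices mentioned above (Int, Float64, BigInt), assuming a 64-bit Julia where `Int` is `Int64`. The variable names are only illustrative.

```julia
x_int   = 3^40        # Int64 arithmetic: wraps around modulo 2^64
x_float = 3.0^40      # Float64: keeps the magnitude, loses the low bits
x_big   = big(3)^40   # BigInt: exact, but slower

# The wrapped Int64 value differs from the exact one by exactly 2^64,
# so its last bits (and therefore its parity) are still right.
println(x_int == x_big - big(2)^64)        # true
println(isodd(x_int) == isodd(x_big))      # true

# The Float64 value is close in magnitude but has been rounded to an even number.
println(rem(x_float, 2) == 0.0)            # true
```

Which of the three is "correct" depends on whether the leading digits, the trailing bits, or all of them matter for the next step of the computation.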

rafael.guerra · October 12, 2021, 3:26pm

And one more cherry-picked example:

julia> (3^40 + 1) - 3^40
1

octave:1> (3^40 + 1) - 3^40
ans = 0
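The Octave result can be reproduced in Julia by doing the same computation in Float64. This is a small sketch, assuming 64-bit `Int`; `eps(x)` reports the spacing between adjacent floats at the magnitude of `x`.

```julia
x = 3.0^40

# Near 1.2e19 adjacent Float64 values are 2048 apart, so adding 1 is lost to rounding.
println(eps(x))              # 2048.0
println((x + 1) - x)         # 0.0

# With wrapping Int64 arithmetic the overflow cancels out and the difference is exact.
println((3^40 + 1) - 3^40)   # 1
```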

viraltux · October 12, 2021, 3:34pm

That’s true, and at the end of the day Julia will have a community of users that will decide if the language fits the purpose for their work/research.

That is why I was trying to steer the conversation toward the compliance side of business rather than whether we can trick Octave into a wrong result. If we can, then Octave won’t be accepted either by certain regulatory bodies, which will demand guarantees that correct formulas relevant to their business return correct results.

The concerns and information I shared come from someone who is close to the business side of things. The questions I raised are the ones that the business will raise, and not always in front of you when you are giving a presentation.

If these people have the slightest concern that one of their mathematically correct models can return wrong results, they will not risk it. In fact, even if they want to risk it, the slightest comment to the lawyers in the compliance team will prompt an immediate answer that will stop them from doing so.

Now, Julia is still an awesome language for research, and I think you guys have done an amazing job, but some professionals might not be able to enjoy it for the reasons described above, and I thought my experience from the business side was worth sharing.

StefanKarpinski · October 12, 2021, 3:39pm

It isn’t a “trick”. The Octave result is wrong—in a different way than the Julia integer result, granted, but still wrong. Moreover, it’s impossible to create a system where the answer is always correct, so if that’s the requirement for regulatory compliance, then there will be no compliant languages.

I don’t mean this to come off snarky, but you do know about all the issues that exist with numerical stability of floating-point computations, right? If you think that you can write down a mathematically correct model in Matlab or Octave or R and that it’s guaranteed to give an accurate result, then whoa boy, I’ve got some bad news for you…

vjd · October 12, 2021, 3:46pm

Just an FYI… Pumas-based files are now accepted as Julia scripts by the FDA… https://www.fda.gov/media/85816/download
check out the March 15th update

viraltux · October 12, 2021, 3:52pm

But you can make results as robust as technically possible…

I’ll be a bit dramatic now, but if somebody dies because a pharma model advises the wrong dosage due to an overflow error, what answer do you want to have for the judge and for the families of the victims?

1 - We used the language which to our best knowledge had better checks.
2 - We used the fastest language knowing the slowest one offered better checks.

Answer number two will land you with a cell mate whose name is probably not Julia…

lmiq · October 12, 2021, 3:55pm

I think this became too speculative. Is there a real example of that kind of compliance to show?

NASA landed on the moon probably using Fortran with 32-bit arithmetic, and I’m pretty sure they were quite careful about correct results.

vjd · October 12, 2021, 3:57pm

Not meant to be offensive, but if a Pharma dosage is entirely dependent on a model and not common sense, then we are all in trouble. Models tend to objectify our common sense based on data. If a model prediction does not agree with data, what would you conclude?

lungben · October 12, 2021, 4:02pm

64-bit integers can get quite large before they overflow - roughly 9*10^18.
Most of the time, integers are used for counting and indexing. For these purposes, overflow is not an issue, but performance is. Therefore, overflow safety checks would do more harm than good.

I doubt that there are many use cases where Int64 overflow is even a remote possibility. And if you have such an exotic use case, you should be aware of it and use an appropriate data type.
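For reference, a short sketch of where the Int64 limit sits and what happens when it is crossed. Everything here is in Base; `widen` is just one of several ways to get a larger type.

```julia
limit = typemax(Int64)
println(limit)                          # 9223372036854775807, roughly 9.2 * 10^18

# One step past the limit wraps around to the most negative value.
println(limit + 1 == typemin(Int64))    # true

# Widening first (Int64 -> Int128) gives the mathematically expected result.
println(widen(limit) + 1 > limit)       # true
```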

viraltux · October 12, 2021, 4:03pm

As the saying goes, “common sense is the least common of the senses”. That’s why you need models and data combined. A potential error in a dosage does not need to be something spectacular like 3 kg instead of 3 mg, but something like 30 mg instead of 3 mg.

viraltux · October 12, 2021, 4:05pm

When it comes to compliance, a one-in-a-million chance is one too many if there is an alternative.

lmiq · October 12, 2021, 4:06pm

Sincerely, I think that if you have an actual example of a compliance rule that requires this kind of language characteristic, it would be an interesting topic for discussion, possibly stimulating the development of a package that specifically addresses the requirements of those rules.

tim.holy · October 12, 2021, 4:08pm

I guess I’m a bit puzzled about how this discussion is going. @viraltux you seem to really want to make constructive points and have been respectful throughout. Kudos. At the same time, the central message is this: (1) there are no existing systems that give you what you’re asking for, period; (2) a lot of previous thought by people with very deep understanding of the technical details has already happened; and (3) Julia, almost uniquely among languages, allows you to choose which set of tradeoffs matter to you.

Rather than advocating to change the language (you will never even make headway on that discussion), can I suggest that we morph this into a discussion of which packages fail to gracefully accept SaferInteger inputs and what technical changes need to be made to get them used throughout an entire stack of code? That puts the onus on you, but the point is you care a lot about this and quite a few people have already written the core tools you need in order to do this in the domain you care about.

goerch · October 12, 2021, 4:09pm

https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow

lungben · October 12, 2021, 4:21pm

The question is where there is a risk of one in a million (or larger) for an integer overflow.

Loop variables: assuming one loop iteration takes 10^-9 s, it would take a computer about 300 years to overflow.
Memory: assuming that the integer addresses single bytes, this would correspond to 9 * 10^18 bytes or 9 * 10^6 TB (even when wasting the address space of negative integers). This should be out of reach for quite a while (except maybe for the largest supercomputers).
Currency amounts: for accounting, etc., integers may be used to represent cent amounts if rounding errors are not acceptable. 10^18 cents (or Yen, or whatever major currency) is quite a lot compared to the whole world economy. But here a hyperinflation (similar to Germany 1923 or Hungary 1946) may actually give you values where overflow is possible. This could be avoided by choosing an overflow-safe data type. But actually I am quite sure that the IT systems of all banks and financial infrastructure providers would collapse due to technical reasons if such a hyperinflation happens again in an industrialized country.
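The loop and memory estimates above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the constant name and the 365.25-day year are assumptions made for the calculation.

```julia
# Assumed length of a year in seconds (365.25 days).
const SECONDS_PER_YEAR = 365.25 * 24 * 3600

iterations = typemax(Int64)   # about 9.2 * 10^18 increments before overflow

# One iteration per nanosecond (10^-9 s), converted to years.
years = iterations * 1e-9 / SECONDS_PER_YEAR
println(round(years))         # 292.0

# One byte per integer value, expressed in terabytes (10^12 bytes).
terabytes = iterations / 1e12
println(terabytes)            # about 9.2 * 10^6
```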

Edit: @viraltux may I ask which industry you are referring to?

viraltux · October 12, 2021, 4:22pm

Thank you Tim.

(1) No, but some systems are better than others and more compliant than others.
(2) Unfortunately, in business, technical people’s opinions might not matter that much… I am still using Windows.
(3) Yeah, but my point is not so much technical as it is legal.

I am not advocating changing the language per se; the --safe flag option that I suggested should leave the language as it is for those not using the flag.

I am just bringing attention to potential areas where Julia might not be accepted as a go-to language.

Thank you very much for your input Tim, I appreciate it.

jw3126 · October 12, 2021, 4:27pm

Yes. Throwing an error is a massive slowdown.

There could be a julia --check-overflow option that is much slower but checks for overflows.

tim.holy · October 12, 2021, 4:36pm

Thanks for re-emphasizing that. I think that --safe might be overselling what’s (easily) possible (I like --check-overflow better), but a close approximation seems like it might be feasible, if perhaps a lot of work. Key points:

Julia allows you to override include, and in fact does so at several points during its bootstrap (when it is building itself).
The newish mapexpr argument to include might allow one to automatically substitute SafeInt for Int pretty much throughout when loading & building package code; the authors of the packages would still write their code using Int, but on systems you have built they would be translated into SafeInt.
Someone might need to alter the parser to allow it to construct SafeInt from integer literals.

I wouldn’t promise this would do the job, and it would be quite a lot of work, and you need to be aware that there will be major hits to Julia’s legendary performance. (It seems unlikely to erase all of Julia’s advantages, and Julia would likely be faster than any other “safe” system, but it’s not going to perform anything like “vanilla” Julia.) But in principle, this seems like something that has a fair amount of promise for constructing the kind of system you can use.
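To give a feel for what a mapexpr-based rewrite looks like, here is a rough sketch. It does not substitute SafeInt for Int as described above; as a simpler stand-in it rewrites calls to `+`, `-` and `*` into their `Base.Checked` counterparts while code is loaded. The names `CHECKED_OPS`, `to_checked` and `is_overflow` are hypothetical, the rewrite only makes sense for integer expressions, and it ignores cases such as unary minus.

```julia
using Base.Checked: checked_add, checked_sub, checked_mul

# Hypothetical rewrite table: plain operator => overflow-checked function.
const CHECKED_OPS = Dict{Symbol,Function}(
    :+ => checked_add, :- => checked_sub, :* => checked_mul)

# Walk an expression tree and swap the function in calls like `a + b`.
to_checked(x) = x   # literals, symbols, line numbers: leave unchanged
function to_checked(ex::Expr)
    args = Any[to_checked(a) for a in ex.args]
    if ex.head === :call && args[1] isa Symbol && haskey(CHECKED_OPS, args[1])
        args[1] = CHECKED_OPS[args[1]]
    end
    return Expr(ex.head, args...)
end

# include_string accepts the same mapexpr argument as include.
println(include_string(to_checked, @__MODULE__, "2 + 3 * 4"))   # 14

# Errors raised while loading may arrive wrapped in a LoadError.
is_overflow(err) = err isa OverflowError ||
                   (err isa LoadError && err.error isa OverflowError)

overflow_detected = try
    include_string(to_checked, @__MODULE__, "typemax(Int) + 1")
    false
catch err
    is_overflow(err)
end
println(overflow_detected)   # true
```

A real solution along the lines discussed here would have to deal with literals, floating-point code, and every package in the stack, which is where most of the work would be.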

I doubt anyone will do this work for you for free, though. Since you’re from pharma, what about seeing if they can commission someone to do the work?

Sukera · October 12, 2021, 4:36pm

If you want to practice very defensive programming, you can use the methods from Base.Checked:

julia> Base.Checked |> names
15-element Vector{Symbol}:
 :Checked
 :add_with_overflow
 :checked_abs
 :checked_add
 :checked_cld
 :checked_div
 :checked_fld
 :checked_length
 :checked_mod
 :checked_mul
 :checked_neg
 :checked_rem
 :checked_sub
 :mul_with_overflow
 :sub_with_overflow

Those methods will check for overflow on every operation. This comes with the obvious caveat of not having exp etc. though, so it’s mostly useful for implementing core algorithms providing e.g. checked_exp, not for direct use.
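As a small illustration of building on these primitives, here is a sketch of an overflow-checked integer power written with `checked_mul`. The name `checked_intpow` is hypothetical and is not one of the functions listed above; the plain loop is chosen for clarity, not speed.

```julia
using Base.Checked: checked_mul, mul_with_overflow

# Hypothetical helper: integer power that throws instead of wrapping.
function checked_intpow(base::Integer, n::Integer)
    n >= 0 || throw(DomainError(n, "exponent must be non-negative"))
    result = one(base)
    for _ in 1:n
        result = checked_mul(result, base)   # throws OverflowError on overflow
    end
    return result
end

println(checked_intpow(3, 39) == 3^39)   # true: 3^39 still fits in Int64

# 3^40 does not fit, so the checked version refuses to return a wrapped value.
overflowed = try
    checked_intpow(3, 40)
    false
catch err
    err isa OverflowError
end
println(overflowed)                      # true

# The *_with_overflow variants return a flag instead of throwing.
_, flag = mul_with_overflow(typemax(Int), 2)
println(flag)                            # true
```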

Oscar_Smith · October 12, 2021, 4:38pm

For functions like exp you are returning a floating-point answer anyway, so you don’t need to worry about integer overflow.
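A quick sketch of that point: `exp` converts an integer argument to floating point, and a result that is too large becomes `Inf` rather than wrapping around.

```julia
y = exp(50)              # integer argument, floating-point result
println(y isa Float64)   # true

# Past the Float64 range the result saturates to Inf instead of wrapping.
println(isinf(exp(1000)))   # true
```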
