Discussion about integer overflow

That’s theoretically doable, but there’s a problem: code exists that intentionally relies on integer overflow behavior. To implement such a mode, a distinction would have to be introduced between “operations that are not meant to wrap” and “operations that are meant to wrap”. For example, in sorting and search code, the idiomatic way to find the midpoint between two indices is (lo + hi) >>> 1. This idiom gives the correct answer even when lo + hi overflows:

julia> lo = typemax(Int) - 10
9223372036854775797

julia> hi = typemax(Int)
9223372036854775807

julia> lo + hi
-12

julia> (lo + hi) >>> 1
9223372036854775802

As it stands, you cannot distinguish this kind of intentional usage from usages where integer wrapping is incorrect and should cause an error.

9 Likes

If such uses are relatively sparse, perhaps they could be annotated? @deliberate_overflow begin ... end.
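
A minimal sketch of what such an annotation could look like, assuming a hypothetical future mode where the default operators are checked (the name @deliberate_overflow is made up, and the rewrite below is illustrative; Base.add_int and friends are the wrapping intrinsics that already underlie Int arithmetic today):

macro deliberate_overflow(ex)
    # Map the ordinary operators to the explicitly wrapping intrinsics.
    wrap = Dict(:+ => :(Base.add_int), :- => :(Base.sub_int), :* => :(Base.mul_int))
    rewrite(x) = x  # leave literals, symbols, line nodes, etc. untouched
    function rewrite(e::Expr)
        e = Expr(e.head, map(rewrite, e.args)...)
        if e.head == :call && length(e.args) == 3 && haskey(wrap, e.args[1])
            return Expr(:call, wrap[e.args[1]], e.args[2], e.args[3])
        end
        return e
    end
    return esc(rewrite(ex))
end

julia> @deliberate_overflow typemax(Int) + 1
-9223372036854775808

In current Julia this is a no-op in effect, since + already wraps; the point is only that the annotation boundary is expressible.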

5 Likes

I would love for this to happen! And I will try to push the idea when possible.

5 Likes

That seems pretty much what we can expect: C is not conformant with that requirement by itself, yet that does not get C blacklisted. Some compilers have flags that make the code conform. This seems like exactly the place for a package built to enforce the rules required by that specific field. That seems like a reasonable development path.

1 Like

Ah, nice example. I think intentional overflow is quite common when computing hashes. And hashes are pretty much everywhere, e.g. via Dict.
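
As a hedged illustration of the pattern (toy_hash is a made-up name, and this is an FNV-1a-style sketch, not how Base.hash is actually implemented):

# The offset basis and prime are the standard 64-bit FNV-1a constants.
# The multiplication is supposed to wrap modulo 2^64; checked arithmetic
# here would be a bug, not a safety feature.
function toy_hash(s::String, h::UInt64 = 0xcbf29ce484222325)
    for b in codeunits(s)
        h = (h ⊻ b) * 0x00000100000001b3
    end
    return h
end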

3 Likes

I think this claim is where this thread has gotten so far off the rails. To make this claim, you ought to do a few things:

  1. Define a formal metric of correctness.
  2. Measure that metric of correctness systematically against existing systems you deem relevant.
  3. Demonstrate that Julia does more poorly.

The absence of rigor here ensures this debate will degenerate into pure emotion and provide no benefit to anyone involved.

14 Likes

Just as a datapoint: integer overflow in Rust panics in debug mode, while it wraps around in release mode. And there are dedicated wrapping operations for when you intend to wrap around, so they don’t panic even in debug mode. Julia doesn’t really have a debug mode now, but it could be worth thinking about adopting something like that.
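
For comparison, Julia already exposes both behaviors as explicit function calls, even without a debug/release split (output from a 64-bit session):

julia> Base.checked_add(typemax(Int), 1)   # errors, like a Rust debug build
ERROR: OverflowError: 9223372036854775807 + 1 overflowed for type Int64

julia> typemax(Int) + 1                    # wraps, like a Rust release build
-9223372036854775808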

11 Likes

We could potentially introduce wrapping arithmetic operators (syntax ideas: +% or %+?) which indicate intentional overflow and then use them, but yeah, it’s a whole process. And I’m not entirely convinced that it’s actually better. The situation in C is actually rather different than Julia: in C signed integer arithmetic that overflows is undefined behavior, so it’s very valid to have a compiler that warns you about it. In Julia, integer arithmetic is not undefined at all, it’s explicitly defined to wrap around doing correct modular arithmetic, which is a useful and valid mathematical operation.

I realize that I’m a bit unusual in this, but I don’t even think of fixed-size integer types as approximating ℤ, the ring of integers. Instead, I think of them as implementing fully correct arithmetic in the modular ring ℤ/2ᴺℤ. So when I see Int8(123) + Int8(45) == Int8(-88) it doesn’t seem wrong—that is the correct answer modulo 256, which is what you’ve asked for by using the Int8 type:

julia> mod(123 + 45, 256)
168

julia> mod(-88, 256)
168

From that perspective it’s not only not surprising that 10^1000 == 0, but it’s actually the only correct answer, because if we do 10^1000 in full precision and then reduce modulo 2^64, that’s what we get:

julia> mod(big(10)^1000, big(2)^64)
0

Mathematically, the reason the result is zero is that 2 divides 10 and 1000 ≥ 64, so 2^64 divides 10^1000. The same zero result doesn’t happen if your base isn’t divisible by 2:

julia> 3^1000
6203307696791771937

julia> big(3)^1000
1322070819480806636890455259752144365965422032752148167664920368226828597346704899540778313850608061963909777696872582355950954582100618911865342725257953674027620225198320803878014774228964841274390400117588618041128947815623094438061566173054086674490506178125480344405547054397038895817465368254916136220830268563778582290228416398307887896918556404084898937609373242171846359938695516765018940588109060426089671438864102814350385648747165832010614366132173102768902855220001

julia> mod(big(3)^1000, big(2)^64)
6203307696791771937

The result 6203307696791771937 here is the right answer; you just weren’t asking the question you thought you were. Note how different this is from floating-point instability: in the case of floating-point error, the result is truly just wrong; there’s no sense in which it’s correct.

So from my perspective, this whole discussion has a lot of unwarranted “floating-point primacy” assumptions. I.e. that the ways in which floating-point works are “more correct” even though the rounding and loss of precision is rampant. When you’re doing modular integer arithmetic, there is one and only one correct answer and that’s what native integer operations give you. Yes, that answer isn’t what you expect if you think that Int approximates ℤ, but it is guaranteed to be equal to the answer you’d get in ℤ modulo 2^64. I have a very hard time being convinced that it’s actually wrong for integer types to be defined to do modular arithmetic.

22 Likes

The counterpoint to my own position is that a lot of people do expect the integer types to act like ℤ and write code that incorrectly fails to account for the possibility of integer overflow, which can cause bugs and security holes and whatnot. Or if (for some bizarre reason) someone is computing a drug dosage using Int values and the correct dosage is larger than typemax(Int) = 2^63 - 1, then it could accidentally give someone a negative dosage.

7 Likes

Absolutely, that’s how people should think of these types - the above debate is moot under this model.

However, Julia also allows implicit promotion of Int to Float64, which under this model doesn’t make much mathematical sense: there is no ring homomorphism from ℤ/2ᴺℤ to ℝ (if we think of floats as approximating ℝ).

Perhaps allowing this promotion for fixed-size Ints was a mistake.
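
For instance, on a 64-bit machine the implicit conversion is already lossy at the top of the Int range:

julia> Float64(typemax(Int))   # 2^63 - 1 is not representable; rounds up to 2^63
9.223372036854776e18

julia> typemax(Int) + 0.5      # implicit promotion, same rounding
9.223372036854776e18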

OTOH Int <: Real so this model is not consistent with the type hierarchy names.

2 Likes

The type hierarchy in Julia is for dispatch, not for ontology.
For example, ForwardDiff.Dual is also a subtype of Real, which is mathematically nonsense.

6 Likes

Compliance & validation — even for Pharma — is not a static thing and it’s not just about one particular stance on correctness. It’s a process. It’s a process that can fully support different approaches to integer overflow. It’s a process that can support floating point inaccuracies. It’s a process that can support systems that have null pointers.

Compliance really isn’t just about the tool itself — it’s more about how you use it.

16 Likes

You are not alone in thinking of computer integers as ℤ/2ᴺℤ, not ℤ.

And it is not possible for a finite, physical computer to represent all of ℤ or do arithmetic over all of it. Even with arbitrary-precision types, you will run out of memory or swap space eventually. There is an upper bound.

I don’t think of this as a failure to properly represent math, so much as a discrepancy between the hyper-idealized assumptions of math and the finite universe we live in. And it’s very important for numerical computationalists to understand computer arithmetic standards and their relation to idealized mathematics.

7 Likes

Imagine a construction crew builds a deck, and then it collapses, killing the occupant. In the trial they say “your honor, the real fault of this is the Stanley tool company, every time we tried to drive the screws with their hammer, it kept bending or breaking the screws”.

The fault here is not the hammer manufacturer’s; it’s the crew that thinks driving screws with a hammer is an acceptable practice. Similarly, in Julia, if you need floating-point arithmetic you need to tell Julia to DO floating-point arithmetic, not have it guess that when you wrote 2^68 you meant 2.0^68.
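
Concretely, with 64-bit Int:

julia> 2^68    # Int arithmetic wraps modulo 2^64, and 2^64 divides 2^68
0

julia> 2.0^68  # floating-point arithmetic, as requested
2.951479051793528e20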

8 Likes

I half agree with this, but there is no interface for Real, so it’s not clear what behaviours the type is capturing.

And even if I totally agreed, “Real” is quite the loaded term, so you can forgive people for having certain assumptions about how these types behave: real numbers do not wrap around.

1 Like

Sounds like a failure to validate your software to me, regardless of the source of the bug or choice of implementation language.

For a different perspective on integer overflow, consider a scenario where your imaging equipment goes black at a critical point of an operation because you failed to catch the exception when your safe integers overflowed, rather than yielding one corrupt pixel on your megapixel display due to wrapping around.

In both cases you probably shouldn’t have done integer arithmetic in the first place, and have better testing and validation practices.

3 Likes

This reminded me of Ariane flight V88 (see the Wikipedia article).

1 Like

See e.g. JuliaMath/ChangePrecision.jl on GitHub (a macro to change the default floating-point precision in Julia code) for how this can be done for a block of included code, recursively (that package focuses on changing floating-point types, but it should be straightforward to do something similar for integer types).

In practice, this is a fun hack, but in production code I think you are better off (a) being aware of how computer arithmetic works and using the right numerical type for the job, e.g. not using Int where you want a floating-point calculation, and (b) writing type-generic code so that your code chooses a precision based on the inputs — that way you can run the same code without modification in whatever precision you want (or with things like dual numbers for automatic differentiation and other exotic numeric types).
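
A minimal sketch of point (b), with a made-up function name: nothing below hard-codes a precision, so the accumulator follows the element type of the input.

# Works unchanged for Float32, Float64, BigFloat, dual numbers, etc.
function mysum(xs)
    s = zero(eltype(xs))  # precision chosen by the input, not the code
    for x in xs
        s += x
    end
    return s
end

julia> mysum(Float32[1, 2, 3])
6.0f0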

7 Likes

One place where I’ve seen challenges is image processing, where it’s quite common to use 8-bit modular numbers to denote intensities as a means of saving memory. But it leads to many traps for the unwary: a large fraction of computations should probably be immediately auto-promoted to floating-point or some much wider modular type.

This is no different from what you’re saying, other than “time-to-first-surprise”: issues crop up quite quickly with 8-bit modular arithmetic and much more rarely, in real-world scenarios, with 64-bit modular arithmetic. That suggests that many of us rely more than we should on the fact that modular arithmetic happens to approximate non-modular arithmetic much of the time. On the other hand, encountering it quickly and early means people get trained in their thinking almost immediately, and perhaps that’s not a bad outcome; I’m actually a bit surprised how few times the issue has come up in our issue trackers. Maybe that’s because all other suites do roughly the same thing as we do.
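
For instance, with raw 8-bit values (illustrative; image packages typically wrap these in fixed-point types rather than using bare UInt8 arithmetic):

julia> 0xc8 + 0xc8            # two intensity-200 pixels: 400 wraps to 144
0x90

julia> (0xc8 + 0xc8) ÷ 0x02   # the naive "average" is 72, much darker than 200
0x48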

13 Likes