Discussion about integer overflow

Nit: I would say that 37 and 37.0 represent the same real number — that is why they are == — but that operations like + are defined differently (though they agree until the precision is overflowed) for different subsets of the reals described by different types.
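To make the nit concrete, here is a small Julia sketch (the variable names are mine, chosen only for illustration). It checks that the two types agree on small values, then shows where + stops agreeing once Float64 runs out of precision or Int64 runs out of range.

```julia
# 37 and 37.0 denote the same real number, so they compare equal.
same_value = (37 == 37.0)

# For small values, + gives the same answer in both types.
agree_small = (37 + 1 == 37.0 + 1)

# Past 2^53, Float64 can no longer represent every integer.
n = Int64(2)^53
int_sum = n + 1               # exact in Int64: 9007199254740993
float_sum = Float64(n) + 1    # rounds back to 2^53
agree_large = (int_sum == float_sum)   # false

# At the top of the Int64 range, + wraps around instead of rounding.
wrapped = typemax(Int64) + 1  # equals typemin(Int64)
```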

That being said, it’s relatively rare that you actually want the 2s-complement overflow behavior, so if it weren’t for performance considerations it would be reasonable for the default + to not do silent 2s-complement wraparound, and have some other operator/function for the rare case where you want this.

The fact that 2s-complement — or often, undefined behavior on integer overflow! — is the default in most compiled languages really is a performance tradeoff, I think.
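For what it is worth, Julia already ships the "other function" for the opposite case: the `Base.Checked` module has overflow-checking variants of the arithmetic operations. A minimal sketch of the difference, assuming Int64 operands:

```julia
using Base.Checked: checked_add, add_with_overflow

# Default + on Int64 wraps silently in two's complement.
wrapped = typemax(Int64) + Int64(1)

# checked_add throws an OverflowError instead of wrapping.
threw = try
    checked_add(typemax(Int64), Int64(1))
    false
catch err
    err isa OverflowError
end

# add_with_overflow returns the wrapped result together with an overflow flag.
result, overflowed = add_with_overflow(typemax(Int64), Int64(1))
```

The checked versions cost an extra branch per operation, which is the performance tradeoff mentioned above.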


I considered not adding the preamble in my post because it distracts from the concrete point about closed-source languages (there are real-world examples, e.g. regarding Mathematica and Bessel functions). This is a very public forum, so it’s important to refute damaging statements that are incorrect. But that had already been done explicitly in other comments, and my preamble added nothing.

I agree that being sympathetic to the extent possible is the most productive choice. At times, we all find some design incongruous or stupid only because we don’t understand the bigger picture. That @viraltux expresses a strongly held opinion with non-negligible currency seems to have been somewhat productive. But, it is difficult to have a productive discussion when people fail repeatedly, even after being directed to a number of resources, to exercise some well-founded epistemic humility.

You made a very sweeping statement that is demonstrably false in my experience with healthcare IT:

That is what I was responding to.

I never said that M/MUMPS was a great language for those sorts of use cases - however, for many decades, it was the best language available for many sorts of healthcare applications, and one that didn’t make it difficult for the domain experts to write code that was safe, and saved lives.

These days, if I were writing such applications, I’d definitely use Julia, because then I have the power of using IEEE decimal floating point for the cases where that does make sense, as well as Int64, Int128, Float64, Rational, or whatever makes sense in other cases, all in a very performant, easy-to-understand, high-level language.

In M/MUMPS and its successors, such as Caché ObjectScript, values are strings; it is the operations that determine whether a value is used as a number or not. "123" + "1" returns 124, because the + operator (as well as -, *, /, **) treats its operands as numbers. Internally, things are optimized so that you don’t spend all of your time converting back and forth between string and numeric representations when trying to do arithmetic.
That is totally the opposite of the way Julia (and most other languages) works, true! (But that doesn’t make it an invalid approach, either.)
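A rough Julia sketch of the contrast. `mumps_style_add` is a hypothetical helper I made up for illustration, not a MUMPS or Julia API, and it only handles plain integer strings:

```julia
# Hypothetical helper: emulate an operator that treats its string
# operands as numbers, as + does in M/MUMPS. Integer strings only.
mumps_style_add(a::AbstractString, b::AbstractString) =
    parse(Int, a) + parse(Int, b)

operator_decides = mumps_style_add("123", "1")   # 124

# In Julia the operand types pick the method: * on two strings
# concatenates rather than multiplying.
type_decides = "123" * "1"                       # "1231"
```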

Doesn’t this waste a bit of energy?

If that were really the case, why even bother with anything but Float64, which is what R, Matlab, JavaScript, and other languages seem to have done? What’s the point of Int128, BigInt, Dec128, etc. then?
The roughly 15 exact decimal digits of a Float64 are not always enough.
It still doesn’t deal with the case where simple scaling by powers of 10, or adding up values (adding 0.1 ten times, say), gives results that are unexpected to people used to decimal arithmetic (and that can be incorrect if you are dealing with things like currencies and dosages).
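Both surprises are easy to reproduce in Julia. In this sketch `add_repeatedly` is just an illustrative helper name, and Rational stands in for "some exact representation", not for what a decimal type would do internally:

```julia
# Add x to itself n times, starting from zero of the same type.
function add_repeatedly(x, n)
    total = zero(x)
    for _ in 1:n
        total += x
    end
    return total
end

float_total = add_repeatedly(0.1, 10)       # 0.9999999999999999, not 1.0
exact_total = add_repeatedly(1//10, 10)     # exactly 1 with Rational{Int}

# Scaling by a power of 10 has the same problem in binary floating point.
scaled = 1.1 * 100                          # not exactly 110.0
```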

If you are talking about the approach MUMPS took back in the 1960s, it was quite efficient (and still was until the early 1990s, when IEEE 754 hardware became more widely available; the Intel 80486 wasn’t released until 1989). Numeric values could be packed into delimited strings and operated on easily. Most operations were fairly simple arithmetic, usually addition and subtraction of values with the same scale (exponent); scaling by powers of 10 just meant adding to or subtracting from the exponent.
Back then, without any standard binary floating point hardware, with often only software libraries for binary floating point, decimal arithmetic was actually frequently faster than trying to use binary floating point (and worrying about each software library or hardware implementation being different, and giving different answers).
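To show why those operations are cheap, here is a minimal sketch of a scaled-decimal value in Julia. The `ScaledDecimal` type and its two functions are hypothetical names for illustration; this is not the actual MUMPS storage format, only the coefficient-plus-exponent idea described above.

```julia
# Hypothetical scaled-decimal value representing coef * 10^exp.
struct ScaledDecimal
    coef::Int
    exp::Int
end

# Scaling by 10^k only touches the exponent.
scale10(d::ScaledDecimal, k::Integer) = ScaledDecimal(d.coef, d.exp + k)

# Adding values with the same exponent is plain integer addition.
function add_same_scale(a::ScaledDecimal, b::ScaledDecimal)
    a.exp == b.exp || throw(ArgumentError("operands must share an exponent"))
    return ScaledDecimal(a.coef + b.coef, a.exp)
end

tenth = ScaledDecimal(1, -1)                       # 0.1
total = foldl(add_same_scale, fill(tenth, 10))     # coef 10, exp -1: exactly 1.0
hundredfold = scale10(ScaledDecimal(11, -1), 2)    # 1.1 * 100: coef 11, exp 1
```

No rounding happens anywhere, so adding 0.1 ten times gives exactly 1.0.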

Absolutely, it’s a potentially valid approach for the domain of problems M/MUMPS was designed for. But Julia was designed to have 2 be different from 2.0 because people CARE about the bits, and Julia is a language for people who care about such things. For example, suppose you’re designing an ASIC that automatically load-balances packets between different ports on a 100 Gbps Ethernet switch. You want to prototype your algorithm. It works on 48-bit hardware masks that represent a 32-bit IP address and a 16-bit port number (or a 128-bit IPv6 address that has been XORed down to a 32-bit field plus a 16-bit port)… If you choose M/MUMPS, R, or Octave to do this prototyping, you will be in a WORLD of hurt. Julia will let you do this. If you need to calculate some kind of polynomial ring modulus over these 48-bit fields… Julia will do it happily. R won’t without a lot of hassle. Saying that ASIC designers are just SOL and floats are where it’s at doesn’t seem to be what Julia is about.
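A sketch of what the bit-level part of such a prototype might look like in Julia. The function names and the particular fold (XOR of the four 32-bit words) are my assumptions for illustration, not a description of any real switch design:

```julia
# Pack a 32-bit IPv4 address and a 16-bit port into the low 48 bits of a UInt64.
pack_flow(ip::UInt32, port::UInt16) = (UInt64(ip) << 16) | UInt64(port)

# Fold a 128-bit IPv6 address down to 32 bits by XORing its four 32-bit words.
fold_ipv6(addr::UInt128) =
    (addr ⊻ (addr >> 32) ⊻ (addr >> 64) ⊻ (addr >> 96)) % UInt32

ip   = 0xc0a80001      # 192.168.0.1, a UInt32 literal
port = 0x01bb          # 443, a UInt16 literal
key  = pack_flow(ip, port)                 # 0x0000c0a8000101bb

# Made-up 128-bit address whose four words are 1, 2, 4 and 8.
v6 = (UInt128(1) << 96) | (UInt128(2) << 64) | (UInt128(4) << 32) | UInt128(8)
key6 = pack_flow(fold_ipv6(v6), port)      # folded address is 0x0000000f
```

Every value here has an exact, fixed width, which is the property that a float-only language makes awkward.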


Hey all, this has been a lively discussion, but I fear that it’s long past feeling like a bit of a pile-on (albeit a very civil one :clap:), with @viraltux being the sole person arguing on one side and all the rest of us on the other side, which just doesn’t feel like a nice dynamic. Some very good points have been made:

  • @viraltux has pointed out, quite rightly, that integer overflow can lead to program errors, which can be a concern and that it would be nice to be able to do something like --check-overflow to make sure that integer overflow errors aren’t happening in programs.
  • It looks like a very limited version of this checking is probably going to happen by default in a future Julia version for Int^Int. This is nice because ^ is one of the operators that is most likely to cause an integer overflow and for which it is unlikely that the user actually wanted native integer modular behavior.
  • Implementing --check-overflow for other arithmetic operations is more difficult because Julia has been ambiguous about whether Int+Int and similar operations are intended to do wrapping arithmetic or not. Before we could usefully introduce overflow checks, we’d need +ₘ etc. operators that are explicitly wrapping.
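As a rough illustration of the ^ case in the points above: native Int64 exponentiation wraps silently, while a checked version can be built today from `Base.Checked.checked_mul`. The `checked_power` helper below is hypothetical, is deliberately naive (one multiply per step), and is not how any future built-in check would necessarily be implemented.

```julia
using Base.Checked: checked_mul

# Hypothetical helper, not part of Base: integer power with an
# overflow check on every multiplication.
function checked_power(x::Int64, n::Integer)
    n >= 0 || throw(DomainError(n, "exponent must be non-negative"))
    result = one(x)
    for _ in 1:n
        result = checked_mul(result, x)
    end
    return result
end

wrapped = Int64(2)^64                    # silently wraps to 0
safe = checked_power(Int64(2), 62)       # 4611686018427387904, fits in Int64

threw = try
    checked_power(Int64(2), 64)
    false
catch err
    err isa OverflowError
end
```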

The rest of the discussion is much more subjective and reasonable people can disagree on how dangerous integer overflow is versus floating point error. But it seems like enough points have been made all around, so lest this seem like too much of a pile-on—and it may be too late, with apologies to @viraltux—let’s call it a day.
