I really need to complete my second book, one that focuses just on posit arithmetic and does a better job of showing both the advantages and disadvantages of posits compared to floats. I’ve never cherry-picked examples to make posits look good.

For most situations, I don’t want the compiler to do anything to transform code that uses posits. I was referring to the technique used in ACRITH, PASCAL-XSC, and C-XSC, where a basic block of plus-minus-times-divide operations is converted to a lower triangular system of equations; that system can then be solved to within 0.5 ULP using residual correction, with the quire evaluating each residual to at most 0.5 ULP of error. Kulisch and his colleagues did great work automating the process and getting IBM to commercialize it, but they went after super-high precision, for which there is really not much of a market. Had they instead shown that 32-bit or 16-bit representation could safely replace 64-bit representation and thereby saved time, energy, power, storage, and so on, I think it might have gotten some traction.
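To make the residual-correction idea concrete, here is a toy Python sketch. The function names are mine, and exact rational arithmetic (`fractions.Fraction`) stands in for the quire: like the quire, it accumulates the residual with no intermediate rounding, rounding only once at the end.

```python
from fractions import Fraction

def forward_solve(L, b):
    # Forward substitution in ordinary float arithmetic
    # (the "working precision" in which the basic block runs).
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = b[i]
        for j in range(i):
            s -= L[i][j] * x[j]
        x[i] = s / L[i][i]
    return x

def exact_residual(L, b, x):
    # Stand-in for the quire: accumulate b - L*x with no rounding
    # at all, then round once when converting back to float.
    n = len(b)
    r = []
    for i in range(n):
        acc = Fraction(b[i])
        for j in range(i + 1):
            acc -= Fraction(L[i][j]) * Fraction(x[j])
        r.append(float(acc))
    return r

def refine(L, b, iterations=2):
    # Solve L x = b, then apply residual correction: each pass
    # solves L d = r for a correction d and updates x.
    x = forward_solve(L, b)
    for _ in range(iterations):
        r = exact_residual(L, b, x)
        d = forward_solve(L, r)
        x = [xi + di for xi, di in zip(x, d)]
    return x

# A small lower triangular system with inexact solution entries.
L = [[3.0, 0.0], [1.0, 3.0]]
b = [1.0, 1.0]
x = refine(L, b)  # x is close to [1/3, 2/9] to within rounding
```

The key point is that the residual is the one place where cancellation destroys accuracy, so evaluating it exactly (quire) and solving for a correction recovers a result good to the last bit or so of the working precision.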

The Marc Reynolds piece is quite hilarious (intentionally), peppered with Don Rickles-like insults, and it is relatively free of mathematical or factual errors. The same cannot be said for the work coming out of INRIA, though I went out of my way to make sure they got a paper into our Conference on Next-Generation Arithmetic (CoNGA) last March. The long-established theorems about floating-point error all depend on excluding some of the range. Many theorems fail for subnormal floats, and if multiplication or division is involved, they exclude HALF of all possible inputs to avoid overflow and underflow! If you apply similar exclusions to posits, you find they satisfy even stronger theorems and have provably higher accuracy over the allowed range. The statement that posits can produce huge relative errors when they get into the very large magnitude range, where there are few bits of significand, is true. It is equally true that floats can produce infinite relative errors for a very large part of the set of all possible inputs (about 25 percent when computing a product), which makes them a far weaker stand-in for real numbers.
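The overflow and underflow failures of float products are easy to demonstrate with ordinary IEEE 754 doubles; a two-line Python session suffices:

```python
import math

# Overflow: the true product 1e200 * 1e200 = 1e400 is a perfectly finite
# real number, but the float64 result rounds to infinity,
# an infinite relative error.
p = 1e200 * 1e200
print(p, math.isinf(p))  # inf True

# Underflow: the true product 1e-200 * 1e-200 = 1e-400 is nonzero, but
# the float64 result flushes to zero; every bit of information is lost.
q = 1e-200 * 1e-200
print(q)  # 0.0
```

Both inputs here are unremarkable finite doubles, which is exactly the point: the failure comes from where products of representable values land, not from exotic operands.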

The main author, Florent de Dinechin, is a brilliant guy and while critical (like Kahan), he’s intrigued by posits. But he makes some howling errors. In a recent paper, for example, he claims that posit hardware has to test for the 0 and NaR exception cases while IEEE float hardware does not, and that float hardware is therefore cheaper. I figured out what he means: he assumes all the float exception cases are handled in software! That works, but it is two orders of magnitude slower, and it also opens a security hole of the Meltdown and Spectre variety. So please take de Dinechin’s work with a grain of salt.