Posits - a new approach could sink floating point computation

A perhaps throwaway remark. Intel are promoting FPGAs these days, with FPGAs being bonded into the same package as conventional CPUs. I understand the concept of FPGAs. However, I don’t understand how the data bits get across to the FPGA. So saying that “soon we will have reconfigurable CPUs which can be changed to do almost any posit operation” is probably well wide of the mark.
I ought to look at how the FPGA is connected - nothing really can beat being encoded in the CPU silicon and also being at the top of the cache hierarchy.
Can anyone comment - are the FPGAs in the same logical place as DRAM, i.e. outside the caches?

While these are fascinating questions, are you sure that this forum is the best place to discuss them?

Especially in a tangent to a topic that is already pretty much tangential to Julia.

If not here then where? I think the Julia community gathers a higher concentration of people interested in unusual numeric types than anywhere else, partly because of the language’s unique ability to implement them and use them easily, and partly just because of the nature of people using the language. Anyone who doesn’t want to discuss it can always mute this thread :grimacing:


I haven’t been following the latest developments in posit land. Are lookup tables still the suggested—at the time, the only—way of implementing posit arithmetic, or was that only the case for an earlier incarnation of the design?

I was referring to the FPGA sub-topic.

Generally the way these integrations work is that the FPGA part of the chip gets cache-coherent access to the CPU bus. I.e. you can use it as a fast on-chip accelerator, but not to implement extra instructions for your CPU (unless you’re willing to stall your CPU cycles while the instruction finishes). On the ARM/Xilinx integration side, this generally seems to be an AXI bus, which is fairly standard in that world. I don’t know very much about the Intel version of this, but their marketing suggests that they connect the FPGA on die using UPI, which does have cache coherency support (similar to multiple CPU packages), but I’m not sure to what extent that’s exposed to the programmable logic. You will be perhaps unsurprised to learn that the FPGA industry is riddled with absolutely horrible archaic toolchains, so even if experimentation is possible, it’s certainly not pleasant. I do see some movement in the open source FPGA space these past few months, so I’m somewhat hopeful that this will change in the not too distant future.


Something that Xilinx is working on in particular with their Zynq parts (which integrate ARM cores and FPGA into the same silicon) is a set of tools to allow one to profile software compiled for the ARM and then decide to accelerate it by implementing functions as hardware in the FPGA. They also have the ability in that family to do dynamic reprogramming of the FPGA (portions may be reprogrammed without a system reset or reboot).

It’s a bit of a stretch, or maybe an enormous stretch, but the hardware pieces are there which would allow a flavor of Julia that did JIT compilation to a processor+FPGA combo.


I think that was for “Unums 2.0”: https://ubiquity.acm.org/article.cfm?id=3001758. Posits are (I believe) the third iteration.


Yes, posits do not rely on lookup tables; there are algorithms that would work in hardware very similarly to floats. They just need a “count leading 0s or 1s” operation to decode the regime bits, but as far as I know from the posit hardware people I have spoken to (Peter Hofstee at IBM, for example), that’s not a bottleneck.
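As a toy illustration of that decode step, here is a sketch in Python (the function name and bit layout are my own simplification, not taken from any posit library): the regime is a run of identical bits terminated by the opposite bit, so measuring it is exactly a count-leading-zeros/ones operation.

```python
def decode_regime(bits, n):
    """Decode the regime field of an n-bit posit (toy sketch).

    `bits` holds the n-1 bits that follow the sign bit, MSB first.
    The regime is a run of identical bits terminated by the opposite
    bit; finding its length is a count-leading-zeros/ones operation,
    the only decode step beyond ordinary float-style field extraction.
    """
    first = (bits >> (n - 2)) & 1             # leading regime bit
    run = 0
    for i in range(n - 2, -1, -1):            # scan from the MSB down
        if ((bits >> i) & 1) == first:
            run += 1
        else:
            break
    k = run - 1 if first == 1 else -run       # regime value k per the posit spec
    return k, run
```

A run of three 0s gives regime value -3; a run of three 1s gives +2. Real hardware does the scan with a single priority-encoder/CLZ circuit rather than a loop.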

Concerning lookup tables: I’m currently working on SoftSonum.jl - a self-organizing number format that learns from data. The idea is that you tell your number format what numbers you want to calculate and it will figure out what dynamic range is appropriate and where to put the precision. It is still based on ideas around the posit circle (page 16 ff.). Under the hood it works with lookup tables that have to be precomputed. For 16 bits this is on the edge of what’s feasible (the tables require 4GB of RAM), but for 8 bits this is even attractive as a hardware-supported lookup table. I’m planning to use it as a test case to learn what properties a somewhat “optimal” number format would have, and to better understand if and why posits are better than floats in the applications I have come across so far. If anyone wants to contribute to that project - you are very welcome!
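For a flavor of the lookup-table mechanism (this is not SoftSonum.jl’s actual encoding — I’m substituting a made-up 8-bit fixed-point format so the sketch stays self-contained): any binary operation on an 8-bit format collapses into one precomputed 64 KiB table.

```python
def decode(b):
    """Made-up 8-bit format for illustration: signed fixed point, 4 fraction bits."""
    return (b - 256 if b >= 128 else b) / 16.0

def encode(x):
    """Round to the nearest representable value, saturating at the ends."""
    q = max(-128, min(127, round(x * 16.0)))
    return q & 0xFF

# Precompute the entire multiplication once: 256*256 = 65536 byte entries.
MUL = [encode(decode(a) * decode(b)) for a in range(256) for b in range(256)]

def mul(a, b):
    return MUL[a * 256 + b]    # one table lookup replaces all arithmetic
```

The same construction works for any 8-bit encode/decode pair, which is what makes an 8-bit learned format hardware-friendly: the “arithmetic unit” is just ROM.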


I want to comment on the “Lack of scale invariance” point. What you describe as an advantage for floats, I regard as one of the bigger inefficiencies of floats, so yes, it’s probably a matter of perspective, but let me briefly explain mine: Having an approximately constant precision across the whole range of representable numbers is nice in the sense that a programmer doesn’t have to worry about where to perform computations. As long as no overflows or underflows are occurring, you can basically expect the result to be as good as it gets with floats. This reminds me of the “just throw double precision at everything” attitude, which, don’t get me wrong, brought scientific computing very far, as one basically doesn’t have to worry about precision.

However, going down to 32 bits, 16 bits or even further is a question of optimizing performance/energy/storage etc., and in that sense putting a lot of precision where it’s rarely needed, as floats do, is not an optimal use of your available bit patterns. I would therefore like to rephrase the “lack of scale invariance” disadvantage you mention as an advantage for posits: In contrast to floats, posits allow you to rescale* your algorithms to make more efficient use of the higher precision around 1. At the same time, with posits you get a wide range of representable numbers to avoid hitting floatmin/max in case it turns out to be difficult to squeeze parts of your algorithm into a small range.

Yes, this calls for a paradigm shift in which it’s up to the programmer to have a rough idea of the numbers that occur and scale* algorithms accordingly. I’m not talking about programmers who write general-purpose code but the ones who use it for a specific application. Let me explain what the asterisks mean.

(*) Rescaling here means any of the following approaches that analytically lead to the same result, but may alter the result significantly when using finite precision arithmetics: Non-dimensionalization, actual scaling with a multiplicative constant, reordering the arithmetic operations, precomputing constants at high precision, and in general avoiding intermediate very large/small numbers.
Example: Say you want to compute energy via the Stefan-Boltzmann law, you could simply write

const σ = 5.67e-8
f(T) = σ*T^4

However, with Float16 this causes an overflow for T = O(100). One way would be to pull σ into the power 4 and precompute σ^(1/4) (ideally at high precision),

const σ_onefourth = σ^(1/4)
f2(T) = (σ_onefourth*T)^4

Overflow problem solved. Another way would be to say, hmm, T = O(100) is not great so why not Tₛ = s*T, with s = 1/100 to have Tₛ = O(1), then

const s = 1/100
const σₛ = σ/s^4     # which is 5.67, so O(1)
f3(Tₛ) = σₛ*Tₛ^4

So, you end up with the same number of operations, but all O(1), therefore robust to overflows and you can make use of the higher precision of posits around 1.
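To make the overflow concrete, here is the same computation emulated in Python, rounding every intermediate to IEEE half precision via the struct module (the `f16` helper is my own stand-in for Float16, not part of any library):

```python
import struct

def f16(x):
    """Round a float to IEEE half precision, emulating Float16 storage."""
    try:
        return struct.unpack('<e', struct.pack('<e', x))[0]
    except OverflowError:
        return float('inf') if x > 0 else float('-inf')

sigma = f16(5.67e-8)                  # subnormal in half precision
T = f16(300.0)                        # T = O(100)

naive = f16(f16(T**4) * sigma)        # T^4 = 8.1e9 overflows half precision

sigma_s = f16(5.67)                   # sigma/s^4 precomputed at high precision
Ts = f16(f16(0.01) * T)               # Ts = s*T = O(1)
rescaled = f16(f16(Ts**4) * sigma_s)  # every intermediate is O(1): stays finite
```

`naive` comes out infinite while `rescaled` lands within half-precision rounding of the true 459.27 W/m², even though both versions perform the same number of operations.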


I do think that’s a really interesting and potentially valuable trade off to explore. However, that is not how posits (and unums before them) have been pitched. Rather, they have been sold as “you don’t have to worry about numerical precision errors anymore”. Whereas the reality is that you have to be even more careful and aware of numerical precision. In exchange for that, you can save significant time, memory and energy. Fair enough. Sometimes that’s a worthwhile trade off. It’s the repeated claims that posits/unums give you something for free with no downside that are problematic. Explicitly giving up scale invariance in exchange for dynamic precision is a potentially very useful trade off, but don’t pretend it’s not a trade off. A format which has a) dynamic precision and b) some way of addressing the representation of operation error precisely and c) a mechanism for tracking what the minimum precision throughout a computation would be very interesting indeed. Perhaps that’s what posits are converging to, but they’re not there yet.


Do you think that time and effort invested in this has better payoff than devoting the same amount of resources to error analysis and algorithm design for numerical stability, sticking to IEEE 754 floating point?

E.g. for the calculation above, simply working with logs would be a fairly standard thing to do. Details of course depend on the context.
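For instance, a log-space version of the Stefan–Boltzmann example might look like this (a sketch in Python; in an actual Float16 computation the exp and log themselves would need care):

```python
import math

SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def f_log(T):
    """sigma * T^4 evaluated in log space: all intermediates stay O(1)-O(10)."""
    return math.exp(math.log(SIGMA) + 4.0 * math.log(T))
```

The intermediates here are roughly -16.7, 22.8, and 6.1 - comfortably inside any format’s range - at the cost of a log and an exp per evaluation.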

Scale invariance in Float16 is a bad joke anyway. So for specifically 16 bit numbers, I think the answer is “yes”.


Adding hardware “quire” accumulators for IEEE floats would be a killer hardware feature. There’s an ongoing conversation about Python’s math.fsum and superaccumulators, which are surprisingly fast on modern hardware, with various people trying to optimize them further. It seems like something where a hardware mechanism for getting guaranteed exact sums would be a game changer.
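For reference, Python’s math.fsum (Shewchuk’s algorithm) already delivers the exact sum in software today; a hardware accumulator would make the same guarantee essentially free:

```python
import math

xs = [1e16, 1.0, -1e16]

naive = sum(xs)        # the 1.0 is absorbed: 1e16 + 1.0 rounds back to 1e16
exact = math.fsum(xs)  # exact summation with a single rounding at the end
```

The naive left-to-right sum loses the 1.0 entirely and returns 0.0, while fsum returns 1.0 - the correctly rounded result of the exact sum.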


I rarely use Float16 for computation (only for storage, very occasionally), but I am under the impression that most nontrivial computations would need to be expertly designed to get acceptable accuracy in IEEE 754 and also for posits (at least I don’t see any specific feature of posits that allows one to skip this step).

Also, I think that Float16 is kind of a red herring here, as most of the focus seems to be on 32-bit.

A quire is a data type, and when used as an accumulator I think everyone will see the need to make it as register-like as possible. But yes, you can store the quire register in memory and load it from memory. You can also add the contents of a quire value stored in memory to the quire register (and I imagine we need subtraction support, also). I feel good about the practicality of this up to 32-bit posits, for which the quire is 512 bits. (If you attempt to build a quire for 32-bit floats, you’ll find it needs to be 640 bits or thereabouts, an ugly number from an architecture perspective.) If we find that 64-bit posits are needed for some applications, the quire is 2048 bits and that starts to look rather unwieldy.
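A quire can be emulated in software with an exact accumulator; here is a Python sketch using exact rationals (the class name and API are mine for illustration - a hardware quire is of course a fixed-width fixed-point register, not an unbounded rational):

```python
from fractions import Fraction

class Quire:
    """Toy software quire: exact accumulation of products of floats.

    A hardware quire is a fixed-width register (512 bits for 32-bit
    posits); Python's exact rationals let this sketch skip the width
    bookkeeping entirely.
    """
    def __init__(self):
        self.acc = Fraction(0)

    def fma(self, a, b):
        # every float converts to an exact rational, so no rounding happens here
        self.acc += Fraction(a) * Fraction(b)

    def to_float(self):
        # the one and only rounding step happens on read-out
        return float(self.acc)

# A dot product whose naive float evaluation loses the 1.0 entirely:
q = Quire()
for a, b in [(1e16, 1.0), (1.0, 1.0), (-1e16, 1.0)]:
    q.fma(a, b)
```

Reading out `q.to_float()` gives the exact dot product rounded once, which is the whole point of the quire: one rounding per reduction instead of one per operation.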

Although I’m intrigued by the ACRITH and XSC approach, my strong preference is that all use of the quire be explicit in the source code, not automatic. As soon as things become automatic (covert), differences arise in the result. We have a real shot at achieving perfect bitwise reproducibility with posits, if the Draft Standard is followed and we can keep language designers from performing any covert optimizations that can affect the rounding in any way. I’ve hunted down every source of irreproducibility in the way IEEE 754 floats work and corrected those in the Posit Draft Standard. Posits should be as reproducible as integer calculations where the precision (number of bits) in the integer is explicit at all times. If I may be so bold as to offer a meme:



Completely missing from the discussion so far is Gustafson’s Valids, his complement to Posits. It’s similar to interval arithmetic, and should be helpful.

“In February 2017, Gustafson officially introduced unum type III, posits and valids.”

If you can get away with half the bits for posits compared to regular floats, then valids - each a pair of posits - are a good option.

See also: https://github.com/JuliaIntervals/IntervalArithmetic.jl that I assume you can use with Posits.

And off-topic (regarding the FPGA discussion above): FPGA support seems at least to be on the table (but not implemented), and I found a related Julia library.


Valids are covered briefly in the “posit4” document on posithub.org. One reason for not going into more detail is because valids are just like Type II unums, expressed as a start-stop pair indicating the arc of the projective real circle that they traverse.

The big advantage of valids (or any other type of unum) over intervals that use IEEE floats as endpoints is that they distinguish between open and closed endpoints. When you want to work with intervals, you’re dealing with sets; you have to be able to intersect, union, and complement sets, but classic interval methods cannot do that because all the endpoints are closed. If you need to express underflow, for example, you use the open interval (0, minreal). Overflow is (maxreal, ∞). Valids give you back all the features like signed infinities and what IEEE 754 calls “negative zero” that posits eliminate to simplify hardware.

A valid is simply a pair of posits, where the last bit of each posit is the ubit (uncertainty bit). If the ubit = 0, the value is exact; if the ubit = 1, the value is the open interval between adjacent exact posits.
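As a toy sketch of that decoding rule (heavily simplified and entirely my own construction - real valids live on the projective circle and wrap around, which this ignores), one endpoint of a valid decodes like so:

```python
def endpoint(bits, exact_values):
    """Decode one valid endpoint from its bit pattern (toy sketch).

    exact_values: the sorted exact values of the posit format with the
    ubit stripped off. ubit = 0 means the endpoint is that exact value
    (closed); ubit = 1 means the open interval up to the next exact value.
    """
    ubit = bits & 1
    idx = bits >> 1
    if ubit == 0:
        return (exact_values[idx], exact_values[idx], True)   # exact, closed
    return (exact_values[idx], exact_values[idx + 1], False)  # open interval

# A made-up value table standing in for a tiny posit format:
EXACT = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0]
```

So the pattern ending in 0 denotes the exact value 0.5, while the next pattern, ending in 1, denotes the open interval (0.5, 1.0) - exactly the open-endpoint capability that float-based intervals lack.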

As an aside, Kahan’s “scathing” review of unums is riddled with errors and self-contradictions, like much of his unrefereed blogging. He got called on those errors in our debate, in which he also revealed he had only read snippets of The End of Error: Unum Computing, not the whole thing; for example, he claims that the Wrapping Problem is not mentioned in the book. Well, it was left out of the Index by mistake, but it is in the Table of Contents, the Glossary, and at least one chapter is dedicated to it, with full-page illustrations of what causes it and how to deal with it. A transcript of “The Great Debate” with Kahan (which predates the invention of posits) is at http://www.johngustafson.net/pdfs/DebateTranscription.pdf


I’m sure you [all] know about the new IEEE 754-2019.


It has e.g. “The relaxed ordering of NaNs” probably of interest, to Julia gurus (one area where posits win).

It has “new tanPi, aSinPi, and aCosPi operations are recommended” (previously not thought needed), and I’m curious whether posits have something similar (does the standard include any trigonometry, or is it all left to libraries?). I only know of the dot product (and a few more?) as extra operations posits have standardized vs. regular IEEE.

Also e.g. “5.3.1 {min,max}{Num,NumMag} operations, formerly required, are now deleted” and “9.6 new {min,max}imum{,Number,Magnitude,MagnitudeNumber} operations are recommended; NaN and signed zero handling are changed from 754-2008 5.3.1.” seems interesting vs. posits.


Thanks for this update… I was not aware of this latest effort to rearrange the deck chairs on the Titanic. It may be one of the first indications of IEEE 754 attempting to keep up with posits and unums by adopting some of their features. In The End of Error: Unum Computing pp. 156–158 I wrote that switching to degrees as the unit solves the argument reduction problem. In the Draft Posit Standard I mandated tanPi, aSinPi, aCosPi, and other trig functions that treat all angles as an exactly-representable number times π. While traditional cos(x), sin(x) etc. may make calculus look more elegant, it turns math libraries for trig functions into very expensive random number generators for most of the dynamic range.
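The argument-reduction point can be seen in a few lines of Python (a sketch of the xPi idea; a production sinPi would also special-case half-integer arguments so they return exact results):

```python
import math

def sinpi(x):
    """sin(pi*x) with exact argument reduction.

    math.fmod is exact for floats, so reducing x modulo the period 2
    loses nothing, whereas computing sin(math.pi * x) directly inherits
    the rounding error of the product pi*x - catastrophic for large x.
    """
    r = math.fmod(x, 2.0)        # exact: r = x - 2k for some integer k
    return math.sin(math.pi * r)
```

For x = 1e16 (an exactly-representable even integer), `sinpi` reduces the argument to exactly 0 and returns 0.0, while `math.sin(math.pi * 1e16)` returns an essentially random value in [-1, 1] - the “expensive random number generator” effect.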

So many of the changes are now “Recommended.” Standards documents should not have recommendations or suggestions or options. They should have Requirements, and that’s it. IEEE 754 should be called a Guidelines document, not a Standard. It could have more than one level of “compliance” if they wanted to formally define that, but I don’t see them going that direction.

And yes, posits eliminate the kinds of issues that IEEE 754 faces with multiple NaN values, signed zero handling, and round-ties-to-even when both choices are odd. Someday we will look back on such difficulties and laugh.