Hardware “quire” accumulators for IEEE floats would be a killer feature. There’s an ongoing conversation about Python’s fsum and superaccumulators, which are surprisingly fast on modern hardware, and various people are trying to optimize them further. A hardware mechanism for guaranteed exact sums seems like it would be a game changer.
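For anyone who hasn't run into the problem, here is a small sketch of what a correctly rounded sum buys you in software today. `math.fsum` is the real standard-library function; `naive_sum` is just a helper written for this illustration (an explicit loop, since the built-in `sum` in recent Python versions already applies some compensation to floats).

```python
import math

def naive_sum(values):
    """Plain left-to-right float addition, rounding after every step."""
    total = 0.0
    for v in values:
        total += v
    return total

# The 1.0 is far below the precision of 1e100, so naive addition loses it.
values = [1e100, 1.0, -1e100]

naive = naive_sum(values)   # 0.0: the 1.0 was absorbed and then cancelled away
exact = math.fsum(values)   # 1.0: partial sums are tracked exactly, rounded once

print(naive, exact)
```

A hardware accumulator would presumably give the `fsum`-style answer without the software bookkeeping, though that is my reading of the idea rather than a description of any shipping instruction set.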