Revisiting saturating intrinsics

Hi all.

Here is the origin of the discussion I want to have.

I had to leave development a few years ago due to some misfortunes. I apologize for the inconvenience.

Unfortunately, no progress has been made in FixedPointNumbers or CheckedArithmetic over the last few years regarding arithmetic.
Rather, the glow of hope offered by LoopVectorization (VectorizationBase) is about to fade.

Given this situation, I believe that functions like saturating_add (or saturated_add) are better defined in Julia’s Base.
The reason they should be under Base is that they should correspond closely to LLVM’s saturating intrinsics.

function saturating_add(x::T, y::T) where {T <: Integer}
    # Add in the next wider type, then clamp back into the range of T.
    clamp(widen(x) + widen(y), T)
end

seems to optimize to the correct LLVM saturating intrinsics.

julia> @code_llvm debuginfo=:none saturating_add(1,2)
define i64 @julia_saturating_add_486(i64 signext %0, i64 signext %1) #0 {
  %2 = call i64 @llvm.sadd.sat.i64(i64 %0, i64 %1)
  ret i64 %2
}
That’s right.
The essence of this issue is not the implementation of saturating_* but where the definition should be.
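
For readers who want to see what the definition above does at the boundaries, here is a minimal sanity check. It only exercises the widen-and-clamp definition from this post on a few small integer types. It is a sketch, not a proposal for the final implementation.

```julia
# Same definition as above: add in the wider type, then clamp back into T.
function saturating_add(x::T, y::T) where {T <: Integer}
    clamp(widen(x) + widen(y), T)
end

# Signed overflow saturates at typemax/typemin instead of wrapping.
@assert saturating_add(typemax(Int8), Int8(1)) == typemax(Int8)
@assert saturating_add(typemin(Int8), Int8(-1)) == typemin(Int8)

# Unsigned overflow saturates at typemax.
@assert saturating_add(0xff, 0x01) == 0xff

# Without overflow the result is the ordinary sum, and the type is preserved.
@assert saturating_add(Int8(1), Int8(2)) === Int8(3)
```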

More to the point, underlying my idea is the hope that if Julia or the package maintainers go missing, LLVM will take care of it for good. :laughing:

Yeah, I don’t know if it is easier in the long run to add new intrinsics to Base or somehow ensure the clamp widen optimization gets applied correctly in future LLVM/Julia versions.

No matter how smart Julia or LLVM gets, there should always be a definition of what these functions are called.
For example, the function (not the functionality) used for saturating arithmetic on types in Dates should be the same function used for saturating arithmetic on Integers.
(I doubt that saturating arithmetic should be implemented in Dates, though.)
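
To illustrate the "same function, different types" point, here is a rough sketch of one generic function that several types add methods to. The `Ticks` type is purely hypothetical, and the function is defined locally here. In the proposal discussed in this thread, the shared definition would live in Base.

```julia
# Shared generic function (defined locally here; hypothetically owned by Base).
function saturating_add end

# Method for the built-in integer types, using the definition from this thread.
saturating_add(x::T, y::T) where {T <: Integer} = clamp(widen(x) + widen(y), T)

# Hypothetical user-defined type that extends the SAME function
# rather than inventing its own name.
struct Ticks
    value::Int8
end
saturating_add(a::Ticks, b::Ticks) = Ticks(saturating_add(a.value, b.value))

@assert saturating_add(Ticks(100), Ticks(100)) == Ticks(127)
```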

Imho saturating_add would be a good fit for Base.

Saturating versions of the basic arithmetic functions for floating-point types are becoming increasingly important to machine learning.

Interestingly, clamp(widen(x) + widen(y), T) fails to optimize well when broadcast over UInt vectors.

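using Base.Checked: add_with_overflow

function bar(x::T, y::T) where T
    r, f = add_with_overflow(x, y)  # r: wrapped sum, f: overflow flag
    # On overflow, saturate toward the sign of y (always typemax for unsigned T).
    f ? (signbit(y) ? typemin(T) : typemax(T)) : r
end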

is 2x faster when broadcasting UInt vectors in Julia 1.10.2.
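
As a quick check that the two formulations agree, here is a small sketch that compares them exhaustively on 8-bit types and then broadcasts the overflow-flag version over a UInt vector. The names `sat_widen` and `sat_checked` are made up for this example. It does not measure performance, so the 2x figure above is not reproduced here.

```julia
using Base.Checked: add_with_overflow

# Widen-and-clamp formulation from earlier in the thread.
sat_widen(x::T, y::T) where {T <: Integer} = clamp(widen(x) + widen(y), T)

# Overflow-flag formulation from the post above.
function sat_checked(x::T, y::T) where {T <: Integer}
    r, f = add_with_overflow(x, y)
    f ? (signbit(y) ? typemin(T) : typemax(T)) : r
end

# Exhaustive agreement check on the 8-bit signed and unsigned types.
for T in (Int8, UInt8)
    vals = T.(Int(typemin(T)):Int(typemax(T)))
    for x in vals, y in vals
        @assert sat_widen(x, y) == sat_checked(x, y)
    end
end

# Broadcasting over UInt vectors; the last element saturates.
xs = UInt[1, 2, typemax(UInt)]
ys = UInt[1, 2, 3]
@assert sat_checked.(xs, ys) == UInt[2, 4, typemax(UInt)]
```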


Even if Julia supports saturating arithmetic in the future, an external package will still be needed to provide that functionality on versions prior to v1.11.

VectorizationBase.jl was one of the candidates, but as noted above, we now consider it inappropriate.
I had CheckedArithmeticCore.jl as a candidate for that, but now there is another candidate: OverflowContexts.jl.
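
Whichever package ends up hosting it, the compatibility layer could look roughly like the sketch below. Both the module name `SaturatingCompat` and the idea that Base would export something called `saturating_add` are assumptions made for illustration. Neither is confirmed anywhere in this thread.

```julia
module SaturatingCompat  # hypothetical package name

@static if isdefined(Base, :saturating_add)
    # Hypothetical future case: reuse the Base definition if it ever exists.
    const saturating_add = Base.saturating_add
else
    # Fallback for Julia versions without such a function.
    saturating_add(x::T, y::T) where {T <: Integer} = clamp(widen(x) + widen(y), T)
end

end # module

@assert SaturatingCompat.saturating_add(Int8(127), Int8(1)) == Int8(127)
```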