[ANN] PrecisionCarriers.jl: Easy Detection of Floating Point Precision Loss

I recently registered a new package I’ve been working on: PrecisionCarriers.jl. It’s designed to let you check existing code very quickly and easily for loss of precision in its calculations.

I added a macro inspired by BenchmarkTools.jl’s @benchmark macro, called @bench_epsilons. It lets you write a function call, interpolate values from your current scope into it, and define search ranges for variables to explore. The output gives a histogram of the observed precision loss, a printout of the number of “infinite” precision losses (if any), and a list of the worst observed precision losses, together with the function calls to reproduce them. Here’s an example of what it currently looks like in a particle physics application:

Internally, this works by wrapping the inputs in a PrecisionCarrier object, which performs every computation twice: once with the given value in its original type, and once with a BigFloat version of it. At the end of the calculation, the two values can be compared, yielding an estimate of the error in machine epsilons (or the number of remaining significant digits in the result).
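To illustrate the idea, here is a simplified sketch of the approach (my own minimal reconstruction, not the package’s actual implementation — the real type covers the full arithmetic interface):

```julia
# Minimal sketch of a dual-precision carrier: every operation is applied
# to both the working-precision value and a BigFloat shadow value.
struct Carrier{T<:AbstractFloat}
    x::T           # value in the working precision
    big::BigFloat  # shadow value in extended precision
end
Carrier(x::AbstractFloat) = Carrier(x, big(x))

Base.:+(a::Carrier, b::Carrier) = Carrier(a.x + b.x, a.big + b.big)
Base.:-(a::Carrier, b::Carrier) = Carrier(a.x - b.x, a.big - b.big)
Base.:*(a::Carrier, b::Carrier) = Carrier(a.x * b.x, a.big * b.big)
Base.sqrt(a::Carrier) = Carrier(sqrt(a.x), sqrt(a.big))

# Estimated error, in units of the machine epsilon at the result's magnitude.
epsilons(a::Carrier) = Float64(abs(big(a.x) - a.big) / eps(a.x))

c = sqrt(Carrier(2.0))
c = c * c          # sqrt/square round trip loses roughly one ulp
epsilons(c)        # ≈ 1, while epsilons(Carrier(1.0) + Carrier(1.0)) == 0
```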

Precision is a difficult thing to get right or have a feeling for, and I think it’s helpful to have a good guess about how precise the results of some calculations actually are, especially in scientific code. Contrary to popular belief, there are many more possible ways to introduce precision loss than just catastrophic cancellation or very long chains of arithmetic.
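For instance (a standalone illustration, not tied to the package): intermediate overflow is one such mechanism that involves no cancellation at all. Neither the inputs nor the true result are anywhere near floatmax, yet a naive formula can still blow up:

```julia
# A less obvious failure mode: intermediate overflow.
naive_hypot(x, y) = sqrt(x^2 + y^2)

naive_hypot(1e200, 1e200)  # Inf — x^2 overflows on the way
hypot(1e200, 1e200)        # finite, ≈ 1.414e200 — Base's careful version
```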

There are a lot of things that can still be improved here, and I have a few ideas in mind already, but I’d also be interested in what others think. Maybe some people want to give it a try and have feedback.

A few packages I found are similar, but don’t seem intended to actually detect code that loses precision:

  • FloatTracker.jl specifically finds NaN and Inf values
  • DoubleFloats.jl, MultiFloats.jl, ArbFloats.jl, etc. provide floating point types with more precision
  • ValidatedNumerics.jl provides a way to have known error bounds on calculations, which is probably the closest to what I wanted to achieve with my package

I think you might want to use MultiFloats.jl rather than BigFloat for this. The result will be a lot faster.


That might be a good idea, I can look into that. I was also considering DoubleFloats.jl, because it allows specific backend floating-point types to be used, which should allow usage on GPUs that don’t have fp64 support (like oneAPI GPUs).

I’ve looked at MultiFloats.jl a bit more, and I think I will not be using it as a backend for now. Speed isn’t too much of an issue right now, and especially for a package concerned with finding precision errors, I dislike that MultiFloats don’t follow IEEE 754 for the sake of speed. Their README says that precision is lost in some cases, even for addition or subtraction (which, unless I’m misunderstanding, means correct rounding is not guaranteed), and that Inf and NaN values are not correctly propagated. This would lead to my package reporting infinite deviation between Inf and NaN, for example, when there might otherwise have been no deviation (Inf vs Inf).
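To show why IEEE propagation matters here, a sketch of how a deviation check might treat non-finite values (`deviation` is a hypothetical helper of my own, not the package’s API): if the backend failed to propagate Inf or NaN the way Float64 does, the two sides would disagree and the reported deviation would be spuriously infinite.

```julia
# Hypothetical deviation check between a working value and its shadow.
function deviation(x::Float64, shadow)
    if isnan(x) || isnan(shadow)
        # both NaN: the computations agreed, so no deviation
        return (isnan(x) && isnan(shadow)) ? 0.0 : Inf
    elseif isinf(x) || isinf(shadow)
        # both Inf with the same sign: agreement; otherwise one side lost it
        return (x == shadow) ? 0.0 : Inf
    end
    return Float64(abs(big(x) - shadow) / eps(x))
end

deviation(Inf, big(Inf))   # 0.0 — both sides agree on the overflow
deviation(Inf, big(42.0))  # Inf — the shadow side lost the overflow
```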

However, I did find something quite fascinating (and initially very confusing) with my go-to unstable function:

function unstable(x, N)
    y = abs(x)
    for i in 1:N
        y = sqrt(y)
    end
    w = y
    for i in 1:N
        w = w^2
    end
    return w
end

In exact arithmetic, this always yields abs(x) regardless of N, but with any finite-precision approximation the repeated rounding drives the result toward 1 (for most inputs).
For example, with standard Float64 or Float32:

julia> for i in 1:5000 if unstable(1.5, i) == 1.0 print(i); break; end end
51
julia> for i in 1:5000 if unstable(1.5f0, i) == 1.0 print(i); break; end end
22

For the standard BigFloat:

julia> for i in 1:5000 if unstable(big(1.5), i) == 1.0 print(i); break; end end
254

So for MultiFloats, I was expecting something like ~100 for a Float64x2, following the pattern above: the break happens at roughly the number of mantissa bits (53 for Float64, 24 for Float32, 256 for default-precision BigFloat, ~107 effective bits for Float64x2). Surprisingly:

julia> for i in 1:5000 if unstable(Float64x2(1.5), i) == 1.0 print(i); break; end end
1074

That’s a lot more than I expected. And also, it doesn’t change for the other Float64xN types:

julia> for i in 1:5000 if unstable(Float64x8(1.5), i) == 1.0 print(i); break; end end
1074

I’m assuming this happens because the hi part just approaches 1, while the lo part(s) become smaller and smaller but are not constrained to be the “extended mantissa”, like they are in BigFloat. So this approach only fails when the exponent of the lo part becomes too small.

Kinda cool and maybe interesting for use cases where you need to represent numbers very close to 1.
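A back-of-the-envelope check of that explanation (my own estimate, assuming the lo limb fails once the residual underflows below the smallest subnormal Float64, 2^-1074):

```julia
# After N square roots, y ≈ 1 + log(1.5)/2^N (first-order expansion).
# The lo limb can hold this residual until it underflows below the
# smallest subnormal Float64, 2^-1074 — independent of how many limbs
# a Float64xN type has, which would explain why Float64x2 and Float64x8
# give the same N.
N_fail = ceil(Int, 1074 + log2(log(1.5)))  # ≈ 1073, within one of the observed 1074
```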


From the docs:

The given number of epsilons is not the same as an error of measurement. It should not be used for error bars or similar. It’s rather a rough indicator of numerical noise, but for example it can statistically happen that imprecisions cancel each other for certain cases, but this does not indicate better stability.

🤔

What is the rationale for measuring the precision loss at each step of each calculation, instead of just calculating the error for the result?

I’m not sure what you mean; the error is only calculated when you call epsilons or significant_digits on a PrecisionCarrier, which usually happens only when the user asks for it.

I did have plans for something like an @inspect macro that lets you input a specific sample that results in bad precision and produces something like a flamegraph or stack trace showing where exactly the precision is lost. That would involve calculating the precision at each step. But so far I haven’t thought of a way to do it that I’m happy with.