Poll: speed vs accuracy for `Float64^-3`

In Julia 1.8, we are getting a pure Julia pow function, and as discussed in fix precision issue in Float64^Float64. by oscardssmith · Pull Request #44529 · JuliaLang/julia · GitHub, part of this involved fixing a bug in x^-3.
Before 1.8, Julia’s Literal pow meant that 10^-3 = 0.0010000000000000002 even though 10.0^-3 = .001. Since literal pow is supposed to always return the same result as ^, this is a bug that we needed to fix for 1.8

There are 2 possible meanings for x^-3. The first is inv(x)*inv(x)*inv(x). The second is to return the floating point value closest to big(x)^-3. The first is about 3x faster, (about 2ns compared to 6ns), but is slightly less accurate. For example, the first approach returns 0010000000000000002 for 10^-3, while the second returns .001. Given this, which would you prefer?

  • Slow but more correct
  • Fast but less correct

0 voters

Note that the decision only affects exponents of -3. All others have a clearly correct choice.


I don’t think I have a considered view as this is unlikely to ever going to matter for me, but is there some guiding principle to follow? One example would be integer overflow, where speed is chosen over security (leading to tons of newbie complaints) on the basis that people can always opt in to the “safe”/“correct” behaviour explicitly.

This would favour the inv approach here, with users opting for big where needed explicitly. Of course this tradeoff will always depend on how “bad” the inaccuracy is vs. how big the speed hit is, so maybe the relative cost/benefit here is very different than for integer overflow.

1 Like

Floating-point arithmetic should not use BigFloats (i.e. MPFR) internally. In addition to that being a quite clunky solution, it potentially opens up some problems with thread safety


Why isn’t falling back to floating-point pow on the table? If we want the same results maybe we should just run the same method.

1 Like

I think you misunderstand the role of BigFloat in this context. The question is whether to match the result of a BigFloat computation, not to perform any computations with BigFloat. Had the latter been an option we wouldn’t be discussing 2ns versus 6ns.


So is this the current behaviour?

1.7.2> foo(x) = x^(-3)
foo (generic function with 1 method)

1.7.2> @btime foo(x) setup=(x=10rand())
  63.432 ns (0 allocations: 0 bytes)

So, we are talking about either fixing an inaccuracy and getting a 10x speedup, or not fixing the inaccuracy and getting a 30x speedup?

I think that by choosing the first alternative, we actually have it both ways. If you need to squeeze out the last nanoseconds, you would just do 1/x^3 instead of x^(-3).

Based on this: Definitely alternative 2.

You should actually call the alternatives:

  • More accurate and much faster than now.
  • Same inaccuracy as before but even faster.
    (Note, I changed the order of the alternatives, for language reasons.)

Thanks you for explaining. In that case, I don’t even understand the question. The result should be as accurate as possible (unless I use something like @fastmath)


I’m fairly certain IEEE 754 - Wikipedia has something to say on the question. The Base mathematics should be IEEE compliant…


I’m not sure if IEEE 754 covers powers. The wikipedia page only lists

  • Arithmetic operations (add, subtract, multiply, divide, square root, fused multiply–add, remainder)

so there may be some leeway.

This would also be a great opportunity to decide this in general for all of Base and the standard libraries, at least at the level of a “soft” requirement that we don’t deviate from unless there is a really compelling reason.

Practically, I would suggest always going for the more accurate implementation unless the alternative is significantly faster. Eg for me to give up 4\varepsilon of precision, I would expect at least a 20x performance increase. (Of course I just made these numbers up, share your own preferences).


As a user, I would like to always have a choice.

I voted for accurate on the basis that inaccuracy for speed should be opt-in. Which, of course, contradicts using Float64 but lets not get into that :slight_smile: .

Although, tbh, it seems to be rather insignificant for my work

You may easily get an overflow in this case (especially if x is Int). I think if inv(x)*inv(x)*inv(x) is guaranteed to be within 1 ulp in most cases, then the faster option will be better. In other cases where accuracy is the most important, one can opt for big() and friends, or check their algorithm for numerical stability/sensitivity.

1 Like

@DNF has a point: if it isn’t slower than before, we should prefer accuracy.

1 Like


Language standards should define, to be implemented according to this subclause, as many of the operations in Table 9.1 as is appropriate to the language. As with other operations of this standard, the names of the operations in Table 9.1 do not necessarily correspond to the names that any particular programming language would use. In this subclause the domain of an operation is that subset of the affinely extended reals for which the corresponding mathematical function is well defined. A conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain.

# Numba
In [23]: @njit
    ...: def f(x): return x ** -3
In [24]: f(10.0)
Out[24]: 0.001

# Rust
println!("{}", 10.0_f64.powi(-3));

Or (1/x)^3, it was just an example.

Going a tiny extra step to get more performance over accuracy is reasonable. Having to use big to get accuracy is way over the top, imho, not to mention that it’s >100x slower than the “‘slow’ but accurate” alternative.


Actually, how would you get the accurate version, if we go for the ‘fast path’?

It seems that if we go the ‘slow’ path, then trading performance and accuracy will be easy, and fast either way.

If we go the ‘fast path’, how easy is it to trade off accuracy and speed?


The false dichotomy in the poll should be corrected. please make a correct poll instead, if needed with implementation details. The big(...) is misleading.


The Julia Float64 type is already defined as an IEEE 754 number. So question cannot be to deviate from the standard, rather the question is: How strict is the standard w.r.t. this operation (x¯³)?



Fortran, Octave, and Python, as far as I tested, all return exactly 0.001 for 10.0^(-3), thus I guess it wouldn’t be very nice if Julia returned something different for that floating point operation, in which case the literal power should return that as well.