Small, fixed powers

scheinerman · May 28, 2021, 1:32pm

Consider these two functions for computing the 4-th power:

power4(x) = x^4
power31(x) = x * x^3

The second is much speedier than the first because (looking at code_llvm or code_native) the first calls a general helper function and the second just does two multiplies. Indeed, the second is intelligently compiled and essentially does this:

y = x*x
return y*y

Running Julia with -O3 optimization doesn’t help.
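For anyone who wants to reproduce the comparison, here is a minimal sketch. The name `power22` is introduced here only for illustration (it is the hand-written "square twice" version described above), and the exact `code_llvm` output will vary with the Julia version.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

power4(x)  = x^4
power31(x) = x * x^3

# Hypothetical hand-written version: square, then square again (two multiplies).
power22(x) = (y = x * x; y * y)

# All three agree for integer inputs.
@assert power4(3) == power31(3) == power22(3) == 81

# Compare the generated code for an Int argument.
code_llvm(power4, (Int,))
code_llvm(power31, (Int,))
```

Only integer inputs are checked for equality here, because for floating-point arguments the variants may round differently in the last bits.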

It appears that fixed exponents from -1 to 3 are handled well. Consider this:

julia> code_llvm(x -> x^3 * x^3 * x^3, (Int,))
;  @ REPL[9]:1 within `#15'
define i64 @"julia_#15_176"(i64 signext %0) {
top:
; ┌ @ operators.jl:560 within `*' @ int.jl:88
   %1 = mul i64 %0, %0
   %2 = mul i64 %1, %1
   %3 = mul i64 %2, %0
   %4 = mul i64 %3, %2
; └
  ret i64 %4
}

That’s terrific: Computing x^9 is reduced to four multiplies.

It may be worthwhile for exponentiation by small, fixed exponents (larger than 3) to be compiled directly into multiplications (including for other types such as matrices and polynomials).
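As a rough sketch of what that could look like for arbitrary types, the exponent can be passed as a `Val` so that it is known at compile time and the power unrolls into multiplications. The function `fixedpow` is a hypothetical helper, not something in Base, and it uses plain binary squaring, which is not always the fewest possible multiplies.

```julia
# Hypothetical helper (not part of Base): x^p for a compile-time constant p >= 1,
# expanded by repeated squaring. Works for anything that supports `*`.
@inline fixedpow(x, ::Val{1}) = x
@inline function fixedpow(x, ::Val{p}) where {p}
    p isa Integer && p > 1 || throw(ArgumentError("exponent must be an integer >= 1"))
    h = fixedpow(x, Val(p ÷ 2))          # x^(p ÷ 2)
    return isodd(p) ? h * h * x : h * h  # square, plus one extra factor if p is odd
end

fixedpow(3, Val(4))              # integers
fixedpow([1 1; 0 1], Val(5))     # matrices
```

Because the only operation used is `*`, the same definition applies to matrices or a polynomial type, assuming that type defines multiplication.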

stevengj · May 28, 2021, 1:42pm

This is totally possible to do (see e.g. JuliaLang/julia Pull Request #20637, "inline ^ for literal powers of numbers"), but it is slightly less accurate, so the argument was that it should only be done in @fastmath mode or similar.

It would be easy to put this into a package, with a @fastpow macro that turns literal powers in a block of code into optimal addition-chain multiplications (by adapting the PR above), for example.
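To make the idea concrete, here is a minimal sketch of such a macro. It is not the code from the PR above or from any package: the names `@mulpow`, `mulchain`, `expand_pow`, and `rewrite` are invented for this illustration, and it uses simple binary squaring rather than optimal addition chains (for example, it spends 6 multiplies on `x^15`, where 5 are enough).

```julia
# Illustration only: build an expression computing x^n (n >= 1) by binary squaring.
# `x` must be a symbol, so it can be repeated without re-evaluating anything.
function mulchain(x, n::Integer)
    n == 1 && return x
    h = gensym("h")
    half = mulchain(x, n ÷ 2)
    body = isodd(n) ? :($h * $h * $x) : :($h * $h)
    return :(let $h = $half; $body end)
end

# Bind the base expression to a temporary once, then expand the power.
function expand_pow(base, n::Integer)
    t = gensym("base")
    return :(let $t = $base; $(mulchain(t, n)) end)
end

# Walk the expression tree and rewrite `a^n` where n is a literal integer >= 1.
rewrite(ex) = ex
function rewrite(ex::Expr)
    args = map(rewrite, ex.args)
    if ex.head === :call && length(args) == 3 && args[1] === :^ &&
       args[3] isa Integer && args[3] >= 1
        return expand_pow(args[2], args[3])
    end
    return Expr(ex.head, args...)
end

macro mulpow(ex)
    return esc(rewrite(ex))
end

poly(x) = @mulpow x^9 + 2x^4
```

Non-literal exponents and exponents below 1 are left untouched, so they still go through the ordinary `^`.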

scheinerman · May 28, 2021, 1:57pm

I didn’t know about the @fastmath macro. Thanks.

stevengj · May 28, 2021, 2:33pm

To be clear, as far as I know the @fastmath macro doesn’t necessarily do optimal addition-chain exponentiation. It switches from the llvm.pow intrinsic to llvm.powi (at least for hardware floating-point types, not for complex numbers etc.), so the result depends on what powi does.
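A quick way to check what @fastmath does on a given installation is to compare the generated code directly. This is only a sketch: the function names are made up for the example, and which intrinsic or multiply sequence shows up depends on the Julia and LLVM versions.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

pow_default(x)  = x^4
pow_fastmath(x) = @fastmath x^4

# Inspect the code generated for a Float64 argument.
code_llvm(pow_default, (Float64,))
code_llvm(pow_fastmath, (Float64,))

# The two results should agree up to floating-point rounding.
@assert pow_fastmath(1.5) ≈ pow_default(1.5)
```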

For fun, I just posted a package, FastPow, with a @fastpow macro that implements optimal (fewest-multiply) exponentiation for literal integer powers.

Skoffer · May 28, 2021, 2:40pm

FastPow link is not working

stevengj · May 28, 2021, 2:41pm

Should be fixed now, sorry.
