Small, fixed powers

Consider these two functions for computing the 4-th power:

power4(x) = x^4
power31(x) = x * x^3

The second is much speedier than the first because (looking at code_llvm or code_native) the first calls a general helper function and the second just does two multiplies. Indeed, the second is intelligently compiled and essentially does this:

y = x*x
return y*y

Running Julia with -O3 optimization doesn’t help.

It appears that fixed exponents from -1 to 3 are handled well. Consider this:

julia> code_llvm(x -> x^3 * x^3 * x^3, (Int,))
;  @ REPL[9]:1 within `#15'
define i64 @"julia_#15_176"(i64 signext %0) {
top:
; ┌ @ operators.jl:560 within `*' @ int.jl:88
   %1 = mul i64 %0, %0
   %2 = mul i64 %1, %1
   %3 = mul i64 %2, %0
   %4 = mul i64 %3, %2
; └
  ret i64 %4
}

That’s terrific: Computing x^9 is reduced to four multiplies.

It may be worthwhile for exponentiation by small, fixed exponents (larger than 3) be compiled directly into multiplications (including for other types such as matrices and polynomials).

1 Like

This is totally possible to do — see e.g. https://github.com/JuliaLang/julia/pull/20637 — but it is slightly less accurate so the argument was that it should only be done in @fastmath mode or similar.

It would be easy to put this into a package, with a @fastpow macro that turns literal powers in a block of code into optimal addition-chain multiplications (by adapting the PR above), for example.

4 Likes

I didn’t know about the @fastmath macro. Thanks.

To be clear, the @fastmath macro doesn’t necessarily do optimal addition-chain exponentiation as far as I know? It switches from the llvm.pow intrinsic to llvm.powi (at least for hardware floating-point types, not for complex numbers etcetera), so it depends on what powi does.

For fun, I just posted a package with a @fastpow macro that implements optimal (fewest-multiply) exponents for literal integer powers:

https://github.com/JuliaMath/FastPow.jl

7 Likes

FastPow link is not working

Should be fixed now, sorry.

1 Like