The following function:
julia> f(x)=x^5, x^3
f (generic function with 1 method)
compiles to the following LLVM IR:
julia> @code_llvm f(2)
; Function f
; Location: REPL[10]:1
define void @julia_f_35717([2 x i64]* noalias nocapture sret, i64) {
top:
; Function literal_pow; {
; Location: none
; Function macro expansion; {
; Location: none
; Function ^; {
; Location: intfuncs.jl:220
%2 = call i64 @julia_power_by_squaring_22857(i64 %1, i64 5)
;}}}
; Function literal_pow; {
; Location: intfuncs.jl:244
; Function *; {
; Location: operators.jl:502
; Function *; {
; Location: int.jl:54
%3 = mul i64 %1, %1
%4 = mul i64 %3, %1
;}}}
%.sroa.0.0..sroa_idx = getelementptr inbounds [2 x i64], [2 x i64]* %0, i64 0, i64 0
store i64 %2, i64* %.sroa.0.0..sroa_idx, align 8
%.sroa.2.0..sroa_idx1 = getelementptr inbounds [2 x i64], [2 x i64]* %0, i64 0, i64 1
store i64 %4, i64* %.sroa.2.0..sroa_idx1, align 8
ret void
}
Here x^5 is computed by a call to julia_power_by_squaring, while x^3 is computed inline as (x*x)*x, so x^2 is an intermediate result.
Since x^2 and x^3 are already available, it would be cheaper to get x^5 inline with one additional multiplication (x^3 * x^2) than to call julia_power_by_squaring.
Of course, I can fix this by writing the multiplications explicitly:
julia> function f(x)
           x2=x*x
           x3=x2*x
           x5=x3*x2
           x5,x3
       end
f (generic function with 1 method)
julia> @code_llvm f(2)
; Function f
; Location: REPL[12]:2
define void @julia_f_35721([2 x i64]* noalias nocapture sret, i64) {
top:
; Function *; {
; Location: int.jl:54
%2 = mul i64 %1, %1
;}
; Location: REPL[12]:3
; Function *; {
; Location: int.jl:54
%3 = mul i64 %2, %1
;}
; Location: REPL[12]:4
; Function *; {
; Location: int.jl:54
%4 = mul i64 %3, %2
;}
; Location: REPL[12]:5
%.sroa.0.0..sroa_idx = getelementptr inbounds [2 x i64], [2 x i64]* %0, i64 0, i64 0
store i64 %4, i64* %.sroa.0.0..sroa_idx, align 8
%.sroa.2.0..sroa_idx1 = getelementptr inbounds [2 x i64], [2 x i64]* %0, i64 0, i64 1
store i64 %3, i64* %.sroa.2.0..sroa_idx1, align 8
ret void
}
But shouldn't the compiler take care of this on its own?
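As a sanity check that the hand-written version really is a drop-in replacement, here is a minimal sketch comparing the two definitions on a few Int inputs, including ones where the result overflows. The names f_pow and f_mul are made up for this comparison (they are not from Base), and the check assumes the default wrapping Int arithmetic, under which both versions should agree even after overflow.

```julia
# f_pow and f_mul are hypothetical names used only for this comparison.

# Version using literal powers, as in the original f.
f_pow(x) = (x^5, x^3)

# Version with the multiplications written out, reusing x^2 and x^3.
function f_mul(x)
    x2 = x * x
    x3 = x2 * x
    x5 = x3 * x2
    return x5, x3
end

# Compare on small values and on values whose powers overflow Int.
# Int multiplication wraps on overflow, so both versions should still match.
for x in (0, 1, 2, -3, 10^6, typemax(Int), typemin(Int))
    @assert f_pow(x) == f_mul(x)
end
```

This only shows that the two versions return the same values for Int arguments; it says nothing about which one is faster, and it does not cover floating-point arguments, where reordering multiplications can change rounding.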