I have been playing with parametric methods to imitate the template programming of, say, C++. I seem to have some performance overhead when I prefer using method recursion to using a loop.
Compare, for instance, the below recursion
julia> function fact1(::Type{Val{N}})::Int where N
n::Int = N
return n*fact1(Val{n-1})
end
fact1 (generic function with 2 methods)
julia> fact1(::Type{Val{0}}) = 1
fact1 (generic function with 2 methods)
julia> @code_llvm fact1(Val{5})
define i64 @julia_fact1_62903(i8**) #0 !dbg !5 {
top:
%ptls_i8 = call i8* asm "movq %fs:0, $0;\0Aaddq $$-10888, $0", "=r,~{dirflag},~{fpsr},~{flags}"() #3
%ptls = bitcast i8* %ptls_i8 to i8****
%1 = alloca [7 x i8**], align 8
%.sub = getelementptr inbounds [7 x i8**], [7 x i8**]* %1, i64 0, i64 0
%2 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 3
%3 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 5
%4 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 2
%5 = bitcast i8*** %2 to i8*
call void @llvm.memset.p0i8.i32(i8* %5, i8 0, i32 32, i32 8, i1 false)
%6 = bitcast [7 x i8**]* %1 to i64*
store i64 10, i64* %6, align 8
%7 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 1
%8 = bitcast i8* %ptls_i8 to i64*
%9 = load i64, i64* %8, align 8
%10 = bitcast i8*** %7 to i64*
store i64 %9, i64* %10, align 8
store i8*** %.sub, i8**** %ptls, align 8
store i8** null, i8*** %4, align 8
%11 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 6
%12 = getelementptr [7 x i8**], [7 x i8**]* %1, i64 0, i64 4
store i8** inttoptr (i64 140648119871760 to i8**), i8*** %2, align 8
store i8** inttoptr (i64 140648119910736 to i8**), i8*** %12, align 8
%13 = call i8** @jl_f_apply_type(i8** null, i8*** %2, i32 2)
store i8** %13, i8*** %11, align 8
store i8** inttoptr (i64 140648179986664 to i8**), i8*** %3, align 8
%14 = call i8** @jl_apply_generic(i8*** %3, i32 2)
store i8** %14, i8*** %4, align 8
%15 = bitcast i8** %14 to i64*
%16 = load i64, i64* %15, align 16
%17 = mul i64 %16, 5
%18 = load i64, i64* %10, align 8
store i64 %18, i64* %8, align 8
ret i64 %17
}
to the while
-loop equivalent:
julia> function fact2(::Type{Val{N}})::Int where N
n::Int = N - 1
res::Int = N
while n > 1
res *= n
n -= 1
end
res
end
fact2 (generic function with 1 method)
julia> @code_llvm fact2(Val{5})
define i64 @julia_fact2_62904(i8**) #0 !dbg !5 {
top:
ret i64 120
}
Both versions in C++, when compiled with clang++
, emits the same llvm code as the while
-loop version above. However, in julia
, there is a significant difference between using recursive calls and loops in parametric methods.
Is this something that could be fixed? Or, is this something I should remember — as a rule of thumb, should I simply prefer using loops to recursion?
Thanks in advance for your time.