I pulled out the fib benchmark, using both @btime from BenchmarkTools.jl and the older @timeit macro from the Julia Microbenchmark.
If I use Clang (Apple clang version 17.0.0), which like Julia uses LLVM, then Julia (version 1.12.5, libLLVM-18.1.7) is still about 30% slower than C:
Julia, Julia Cint, and C (via @btime):
12.750 μs (0 allocations: 0 bytes)
12.791 μs (0 allocations: 0 bytes)
9.875 μs (0 allocations: 0 bytes)
Julia, Julia Cint, and C (via @timeit, times in ms):
julia,Julia Int,0.013416
julia,Julia Cint,0.014500
julia,C,0.010834
(this is on an Apple M4).
Code
using Test, BenchmarkTools
fib(n) = n < oftype(n, 2) ? n : fib(n - oftype(n, 1)) + fib(n - oftype(n, 2))
@test fib(20) == 6765
@test fib(Cint(20))::Cint == 6765
using BenchmarkTools
import Libdl
const Clib = "libfib"
Cfib = """
int fib(int n) {
return n < 2 ? n : fib(n-1) + fib(n-2);
}
"""
open(`cc -fPIC -O3 -xc -shared -o $(Clib * "." * Libdl.dlext) -`, "w") do f
print(f, Cfib)
end
cfib(n) = @ccall Clib.fib(n::Cint)::Cint
@test cfib(20) == 6765
const NITER = 5
using Printf: @printf
macro timeit(ex, name)
quote
t = zeros(NITER)
for i = 0:NITER
e = 1000 * (@elapsed $(esc(ex)))
if i > 0
t[i] = e # i == 0 is JIT warmup
end
end
@printf "julia,%s,%f\n" $(esc(name)) minimum(t)
GC.gc()
end
end
println("Julia, Julia Cint, and C (via @btime):")
@btime fib(20)
@btime fib(Cint(20))
@btime cfib(20)
println("\nJulia, Julia Cint, and C (via @timeit, times in ms):")
@timeit fib(20) "Julia Int"
@timeit fib(Cint(20)) "Julia Cint"
@timeit cfib(Cint(20)) "C"
By “non-recursive” presumably it is using explicit stack management? That is “fair” — it is still using the same exponential-time algorithm, just via an optimized implementation.
On my machine it is about 1.7× faster than clang if I use gcc 15.2.0 to compile the C code:
Julia, Julia Cint, and C (via @btime):
12.750 μs (0 allocations: 0 bytes)
12.750 μs (0 allocations: 0 bytes)
5.701 μs (0 allocations: 0 bytes)
Julia, Julia Cint, and C (via @timeit, times in ms):
julia,Julia Int,0.014292
julia,Julia Cint,0.015333
julia,C,0.006459
making Julia about 2.2× slower than gcc.
Can anyone replicate this using the code above?