In a Nim vs. D thread, a commenter noted that gcc is better at optimizing recursive functions than LLVM:
This recursion unpacking/unrolling trick that gcc does (at the call site if insulated by a call through a volatile function pointer, and always inside the recursive impl) is, in my experience, a rare compiler optimization, but maybe it will catch on. clang does neither. If you objdump -D the executable (or disassemble in gdb/etc.) you will see a single callq to Fibonacci at the entry with the full N and a pair of callq inside the impl. So with clang the full 1.618**n work happens. On my i7-6700K Linux box with gcc-7.1.0 and clang-4.0.1, I get a time ratio of about 15.3 to 1 (53.5s / 3.5s). -cblake
Nim compiles to C or C++, and you can then choose the C/C++ compiler from there. D (compiled with LDC) uses LLVM.
Nim with gcc is far faster at naive recursive Fibonacci than D or Nim with Clang.
Another user (to whom cblake was responding) reported:
First, with cc = gcc in nim.cfg:
$ time ./a_gcc.exe
Then, with cc = clang in nim.cfg:
$ time ./a_clang.exe
That's about 5 seconds for gcc vs. 2 minutes for Clang.
The difference we see with Julia is much smaller.
Odds are the C benchmark would look more similar to Julia if Clang were used instead of gcc.
FWIW, in my experience, LLVM is far better at auto-vectorizing with AVX than gcc.
This often makes it easier to get performance out of Julia when just crunching numbers.