TPDE: A Fast Adaptable Compiler Back-End Framework

Researchers from the Technical University of Munich (TUM) have announced TPDE as a fast and adaptable compiler back-end framework.

I wonder if it could be useful to Julia in the future.

Beat me to it… had the same thought. Since something like this works with LLVM at some level, how much effort would be required to “drop it in?”

The primary goal is low-latency compilation while maintaining reasonable (-O0) code quality, e.g., as a baseline compiler for JIT compilation or unoptimized builds.
Currently, TPDE only targets ELF-based x86-64 and AArch64 (Armv8.1) platforms.

TPDE seems to be a research tool/benchmark, not aimed at production.

Not really, if -O0 stays where it is relative to -O3 in terms of performance.

In addition to the limited number of targets, do other unsupported features affect Julia?

  • Targets other than x86-64-v1/AArch64 (ARMv8.1) (Linux) ELF.
  • Code models other than Small-PIC.
  • Scalar types: integer types larger than i64 other than i128 (which is supported), pointers with non-zero address space, half, bfloat, ppc_fp128, x86_fp80, x86_amx. Code with x86-64 long double needs to be compiled with -mlong-double-64.
  • Vectors: types that are not directly legal on the target (e.g., <32 x i8> on x86-64); icmp/fcmp; pointer element type; getelementptr with vector types; select with vector predicate; integer extension/truncation.
  • select aggregate type other than {i64, i64}.
  • bitcast larger than 64 bit.
  • Atomic operations might use a stronger consistency than required (e.g., always seqcst for atomicrmw).
  • Calling conventions other than the C calling convention (SysV on x86-64, AAPCS on AArch64).
  • fp128: fneg, fcmp one/ueq, many intrinsics.
  • Computed goto (blockaddress, indirectbr).
  • landingpad with non-empty filter clause.
  • Many intrinsics, and some intrinsics are only implemented for commonly used types (e.g., llvm.cttz only for 8/16/32/64-bit).
  • IFuncs.
  • Various forms of constant expressions in global initializers.
  • Non-empty inline assembly.
  • Full asynchronous unwind info (frame info only correct in prologue and at call sites).
  • Several corner cases that we didn’t encounter so far.
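
One rough way to gauge how much of this list matters for a given module is to dump its textual LLVM IR and scan it for a few of the listed constructs. The snippet below is only a heuristic sketch of that idea: the function name `find_unsupported`, the pattern table, and the sample IR are all made up for illustration, the regular expressions cover only a handful of the bullets, and none of this reflects how TPDE itself detects unsupported input.

```python
import re

# Hypothetical, partial mapping from a few bullets above to regexes over
# textual LLVM IR. These are heuristics and can produce false positives
# (e.g., a value named %half) or miss cases entirely.
LINE_PATTERNS = {
    "computed goto": re.compile(r"\b(indirectbr|blockaddress)\b"),
    "unsupported scalar type": re.compile(
        r"\b(half|bfloat|x86_fp80|ppc_fp128|x86_amx)\b"
    ),
    "non-zero address space": re.compile(r"\baddrspace\((?!0\))\d+\)"),
    "IFunc": re.compile(r"^\s*@.*=.*\bifunc\b"),
    # Matches only when the first quoted string after `asm` is non-empty.
    "non-empty inline asm": re.compile(r'\basm\b[^"]*"[^"]+"'),
}

# Integer types such as i256; the lookbehind skips value names like %i65.
INT_TYPE = re.compile(r"(?<![%@.\w])i(\d+)\b")


def find_unsupported(ir_text):
    """Return (line_number, reason) pairs for lines that look problematic."""
    findings = []
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        # Drop LLVM IR comments; this naive split also cuts string
        # literals that contain ';', which is acceptable for a sketch.
        code = line.split(";", 1)[0]
        for reason, pattern in LINE_PATTERNS.items():
            if pattern.search(code):
                findings.append((lineno, reason))
        for match in INT_TYPE.finditer(code):
            width = int(match.group(1))
            # Per the list above: wider than i64 is unsupported, except i128.
            if width > 64 and width != 128:
                findings.append((lineno, f"integer type i{width}"))
    return findings


# Made-up IR fragment, used only to exercise the scanner.
SAMPLE_IR = "\n".join([
    "define i64 @f(ptr addrspace(10) %p, i256 %x, i128 %i65) {",
    "  %h = fptrunc float 1.0 to half ; mentions indirectbr in a comment",
    '  call void asm sideeffect "", "~{memory}"()',
    "  ret i64 0",
    "}",
])

if __name__ == "__main__":
    for lineno, reason in find_unsupported(SAMPLE_IR):
        print(f"line {lineno}: {reason}")
```

A scan like this says nothing about calling conventions, code models, or unwind info, so at best it narrows down which bullets deserve a closer look.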