What is the maximum inlining depth in practice?

Out of professional curiosity: What is the maximum depth for inlining function calls in Julia in practice?

My programming intuition says that there must exist a practical limit regarding how deeply nested functions may be before they are not inlined anymore. Is this a fixed (possibly configurable) number? Or does it depend on some compiler heuristics? And if yes, what are the limits in practice that people have experienced?

I am asking since Julia favors many small functions and we’ve tried to do this in Trixi.jl. However, people coming from other languages (ahem C++ or Fortran ahem) still cringe when they see it, and I’d like to back my claim that small functions make for fast code with some hard numbers during the next discussion :wink:


As mentioned on Slack, I generated 52 functions, each of which calls the previous one, where the final step calls sum (which itself has a call stack 13 functions deep), and everything was inlined.

So if there is a limit, it’s at least 66 layers deep - and will probably never be reached in practice.


Thanks for your answer! Out of curiosity: How did you generate those functions and how were you able to determine that everything is inlined?

I deleted the code, but it was something along the lines of

@inline a(x) = sum(x)
for i in 'b':'z'
    @eval @inline $(Symbol(i))(foo) = $(Symbol(i - 1))(foo)
end

I also added another loop with the capital letters. It’s easy to extend this as far as you want. To determine whether it inlined, I did
@code_native z([1])
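As an alternative to reading the assembly by eye, the check can be sketched on the optimized typed IR. This is only a rough sketch under the assumption that a leftover `:invoke` statement is a good enough signal for "not inlined"; the variable names `ci`, `rettype`, and `has_invoke` are made up for illustration, and a tuple argument is used to keep the IR small.

```julia
using InteractiveUtils  # provides code_typed outside the REPL

# Rebuild the chain a -> b -> ... -> z described above.
@inline a(x) = sum(x)
for c in 'b':'z'
    @eval @inline $(Symbol(c))(x) = $(Symbol(c - 1))(x)
end

# Optimized, typed IR of z for a 2-tuple of Float64.
ci, rettype = only(code_typed(z, (NTuple{2,Float64},)))

# An `:invoke` statement is a statically resolved call to another method
# that was not inlined; none should remain if the whole chain collapsed.
has_invoke = any(st -> st isa Expr && st.head === :invoke, ci.code)
```

If the chain is fully inlined, `has_invoke` should come out `false`. Dynamic dispatches show up as `:call` statements instead, so this check is deliberately narrow.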


You could use numbers to make reaching arbitrary depths easier:

julia> @inline a_0(x) = sum(x)
a_0 (generic function with 1 method)

julia> for i in 1:200
           @eval @inline $(Symbol(:a_, i))(x) = $(Symbol(:a_, i-1))(x)
       end

julia> @code_native debuginfo=:none syntax=:intel a_200((1.0,2.0))
        vmovsd  xmm0, qword ptr [rdi]           # xmm0 = mem[0],zero
        vaddsd  xmm0, xmm0, qword ptr [rdi + 8]
        nop     word ptr [rax + rax]

julia> @code_llvm debuginfo=:none a_200((1.0,2.0))
define double @julia_a_200_1697([2 x double]* nocapture nonnull readonly align 8 dereferenceable(16) %0) {
  %1 = getelementptr inbounds [2 x double], [2 x double]* %0, i64 0, i64 0
  %2 = getelementptr inbounds [2 x double], [2 x double]* %0, i64 0, i64 1
  %3 = load double, double* %1, align 8
  %4 = load double, double* %2, align 8
  %5 = fadd double %3, %4
  ret double %5
}

In practice, you can hit non-specializing heuristics if you’re not careful.

Do you mean this warning?

Yes. I believe I also hit it using pass-through singleton types, but managed to avoid it by reconstructing them in each layer.
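For readers who have not run into this: assuming the heuristics meant here are the ones described in the Julia performance tips (arguments of kind `Type`, `Function`, or `Vararg` that are only forwarded to another function may not get their own specialization), a minimal sketch of the usual workaround looks like this. All function names below are hypothetical and not taken from Trixi.jl or any post above.

```julia
# Hypothetical pass-through layers; the names are illustrative only.

# `f` is only forwarded, never called here, so Julia may skip
# specializing this method on typeof(f).
apply_plain(f, x) = map(f, x)

# The type parameter `F` requests a specialization per function type.
apply_forced(f::F, x) where {F} = map(f, x)

# Same idea for a type argument that is only forwarded.
zeros_plain(T, n) = zeros(T, n)
zeros_forced(::Type{T}, n) where {T} = zeros(T, n)
```

Both variants return the same values; the difference is only in what the compiler is asked to specialize on, which matters when such a layer sits in the middle of a deep call chain.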