Implicit function specialization vs. explicit specialization vs. generated functions

question

#1

Is there any difference between these three ways of specializing a function that could affect performance? For instance, suppose we have functions f,g,h defined and called as follows:

f(a) = a+a

@generated g(a) = :(a+a)

h(a::Float64) = a+a
h(a::Int64) = a+a

f(1)
f(1.0)

g(1)
g(1.0)

h(1)
h(1.0)

My understanding is that after all this there will be two compiled versions of each function, a Float64 version and an Int64 version. And I presume the compiled code is the same for each of the 3 functions. But in the limit that the functions have many different specializations, is the compilation and dispatch overhead (either in time or memory) similar in all three cases?
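One way to check this directly (a sketch; the exact IR shown by `code_typed` varies across Julia versions) is to compare the typed, optimized code of two of the specializations:

```julia
f(a) = a + a
h(a::Float64) = a + a

# Both should lower to a single floating-point add for Float64 arguments,
# confirming that the compiled specializations are equivalent.
code_typed(f, (Float64,))
code_typed(h, (Float64,))
```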


#2

With type stable arguments, f, g, and h are identical, as you surmise.

Here’s a benchmark for dynamic dispatch:

julia> f(a) = a+a
f (generic function with 1 method)

julia> @generated g(a) = :(a+a)
g (generic function with 1 method)

julia> h(a::Float64) = a+a
h (generic function with 1 method)

julia> h(a::Int64) = a+a
h (generic function with 2 methods)

julia> using BenchmarkTools

julia> const dynamic = Number[0, 0.0]

julia> for fn in [f, g, h]
           @btime $fn(dynamic[1]) + $fn(dynamic[2])
       end
  47.345 ns (2 allocations: 32 bytes)
  41.269 ns (2 allocations: 32 bytes)
  40.773 ns (2 allocations: 32 bytes)

Here we see that h is the fastest, followed by g, while f is noticeably slower. In fact f is slower because of a compiler optimization gone wrong: f is simple enough to be inlined, so the compiler inlines it, and that turns the call site into a dynamic dispatch to +, whose method table is far larger than those of g or h, making the lookup more expensive.

In general this situation is rather rare; inlining doesn’t often hurt performance. So the result of this particular artificial benchmark should be taken with the understanding that, in practice, slowdown due to inlining is not very common. If our functions were more expensive, perhaps to the extent that they are no longer feasible to inline, then there would be little difference between f and h (f would be a little faster because of its simpler method table, with one method instead of two).
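One way to probe the inlining explanation (a hypothetical follow-up, not part of the original benchmark) is to define a variant of f with inlining suppressed and time it the same way:

```julia
using BenchmarkTools

# @noinline keeps the call as a dispatch on f_noinline's small method
# table, instead of inlining the body and dispatching on +'s large one.
@noinline f_noinline(a) = a + a

const dynamic2 = Number[0, 0.0]

@btime f_noinline($dynamic2[1]) + f_noinline($dynamic2[2])
```

If the explanation above is right, this variant should land close to the timings of g and h rather than that of f.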


#3

@fengyang.wang Thanks for the helpful response. It is good to know that the speed is essentially the same (apart from degenerate cases like the one above). The memory allocation shown in the benchmark appears to be just what is needed to hold the function results. But what about the memory of the method definitions and dispatch table? I have seen a few posts (e.g. https://github.com/JuliaLang/julia/issues/7357#issuecomment-277056261, Is mem of compiled/eval'ed functions garbage collected?, and https://github.com/JuliaLang/julia/issues/18446) which indicate that defining lots of methods stresses the Julia runtime, with symptoms including high memory use and slightly degraded performance. I am curious whether generated functions or compiler specializations would have less overhead, since there is (evidently) only one function definition specialized many times, instead of many definitions each specialized once.
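For what it’s worth, one can at least count methods and specializations from the REPL (a sketch; `Base.specializations` is an internal API and may differ between Julia versions):

```julia
f(a) = a + a
h(a::Float64) = a + a
h(a::Int64) = a + a

f(1); f(1.0); h(1); h(1.0)

length(methods(f))  # one method, specialized twice
length(methods(h))  # two methods, each specialized once

# Internal API: list the compiled specializations of each method of f.
for m in methods(f)
    println(collect(Base.specializations(m)))
end
```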


#4

As a rule of thumb, using generated functions for anything that can be accomplished with regular dispatch is a bad idea and the compiler will make you pay for your transgression (in compile time, memory usage, etc.).
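As an illustration (a hypothetical example, not from this thread): here is a generated function that could just as well be an ordinary function, since the compiler constant-folds type-level information on its own:

```julia
# Generated version: inside the body, `x` is bound to the argument's *type*,
# so the field count is spliced in as a literal.
@generated nfields_gen(x) = :( $(fieldcount(x)) )

# Plain version: same result, and for each concrete argument type the
# compiler typically folds this to a constant anyway.
nfields_plain(x) = fieldcount(typeof(x))

nfields_gen((1, 2.0))    # 2
nfields_plain((1, 2.0))  # 2
```

The plain version is preferable: it compiles faster, uses less memory, and imposes none of the restrictions that generated function bodies must obey.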