In my benchmark, I would like to compare the performance of a more specialized method to a less specialized method. Following Ralph Smith’s suggestion, I’ve tried using the invoke function for this.
The problem is that the call to invoke introduces an overhead that exceeds, by more than an order of magnitude, the runtime of the function I actually want to benchmark.
My setup looks roughly like this:
T = Complex128; β = 1; N = 4
α, A, B, C = rand(T), rand(T, N, N), rand(T, N, N), rand(T, N, N)
t_super = (Number, AbstractMatrix, AbstractMatrix, Number, AbstractMatrix)
t_args = (typeof(α), typeof(A), typeof(B), typeof(β), typeof(C))
label = "T=$T, β=$β, N=$N"
SUITE["NDM"]["$label - super"] = @benchmarkable(invoke(commutator!, $t_super, $α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
SUITE["NDM"]["$label - invoke"] = @benchmarkable(invoke(commutator!, $t_args, $α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
SUITE["NDM"]["$label - direct"] = @benchmarkable(commutator!($α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
where commutator! is the function I’d actually like to benchmark and t_super is the signature of the more general method I’d like to compare against.
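To make the role of the two signatures concrete, here is a minimal sketch with a toy function. The name describe and its two methods are hypothetical stand-ins for the two commutator! methods. The signature is written as Tuple{...}, which, as far as I know, is the form newer Julia versions expect in place of a plain tuple of types.

```julia
# Hypothetical stand-ins for the general and the specialized commutator! methods.
describe(x::Number) = "general"
describe(x::Int) = "specialized"

# Normal dispatch picks the most specific applicable method.
direct = describe(1)

# invoke with the supertype signature forces the general method,
# which is what the "super" benchmark relies on.
forced_general = invoke(describe, Tuple{Number}, 1)

# invoke with the exact argument types selects the same method as the
# direct call, which is what the "invoke" benchmark relies on.
forced_exact = invoke(describe, Tuple{Int}, 1)

println((direct, forced_general, forced_exact))
```

Any timing difference between the direct call and the invoke call with the exact argument types is therefore pure call overhead, since both run the same method.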
The result of this benchmark is this:
| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["NDM", "T=Complex{Float64}, β=1, N=4 - direct"] | 627.216 ns (5%) | | | |
| ["NDM", "T=Complex{Float64}, β=1, N=4 - invoke"] | 24.574 μs (5%) | | 208 bytes (1%) | 3 |
| ["NDM", "T=Complex{Float64}, β=1, N=4 - super"] | 21.289 μs (5%) | | 3.25 KiB (1%) | 17 |
“direct” is the benchmark for the commutator! function without any overhead, “invoke” is the benchmark for the exact same method, called through invoke, and “super” is the benchmark for the more general method, also called through invoke. The “direct” time is below the margin of error for the “invoke”/“super” time, which means the result for “invoke”/“super” is completely useless: I’m really only measuring invoke.
So, my question is whether there’s any way within BenchmarkTools.jl/PkgBenchmark.jl to eliminate the overhead of invoke, or any workaround. I see three possibilities:
1. Something within BenchmarkTools akin to setup that just does the right thing (do @jrevels or @kristoffer.carlsson have any ideas?)
2. Copying the code for the more general commutator! method under a new name, commutator_super!, and calling that in the benchmark. This would obviously work, but it’s very inelegant (code duplication), and I’d like to avoid it if at all possible.
3. Automate possibility (2) via metaprogramming. Something like

       commutator_super! = @swap_signature(commutator!, t_args, t_super)

   where the @swap_signature macro would have to be implemented to define a new function that uses the body of the commutator! method for the t_super signature, but with the function signature t_args. This seems pretty difficult. The farthest I’ve gotten with this is to get the AST of a function, e.g.

       function f(x::Int)
           if (x>0)
               return x
           else
               return -x
           end
       end
       meth = methods(f, (Int, )).ms[end]
       ast = ccall(:jl_uncompress_ast, Any, (Any, Any), meth, meth.source).code

   This gives me an array of expressions. Is there any way to turn that back into a function? If so, that should actually allow me to implement the @swap_signature macro.
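A possible middle ground between possibilities (2) and (3), as a minimal sketch only: rather than recovering the AST from a compiled method, keep the general body in a single quoted expression and use @eval to define it under two names. The names sumsq, sumsq_super, and general_body are hypothetical toy stand-ins for commutator!, commutator_super!, and the general method body. This assumes the source of the general method can be edited.

```julia
# Hypothetical toy body standing in for the general commutator! implementation.
general_body = quote
    s = zero(eltype(A))
    for a in A
        s += abs2(a)
    end
    return s
end

# Define the same body under two names: the regular generic method, and a
# uniquely named copy that a benchmark can call directly, without invoke.
for name in (:sumsq, :sumsq_super)
    @eval function $name(A::AbstractMatrix)
        $general_body
    end
end

# A specialized method that normal dispatch prefers for Matrix{Float64}.
sumsq(A::Matrix{Float64}) = sum(abs2, A)

A = [1.0 2.0; 3.0 4.0]
println(sumsq(A))        # specialized method via normal dispatch
println(sumsq_super(A))  # general body, called directly
```

The body exists only once in the source, so there is no duplication to maintain, and the benchmark for the general method becomes an ordinary call to the hypothetical sumsq_super.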