In my benchmark, I would like to compare the performance of a more specialized method to a less specialized method. Following Ralph Smith’s suggestion, I’ve tried using the `invoke` function for this.
The problem is that the call to `invoke` introduces an overhead that exceeds the runtime of the function I actually want to benchmark by more than an order of magnitude.
My setup looks roughly like this:
T = Complex128; β = 1; N = 4
α, A, B, C = rand(T), rand(T, N, N), rand(T, N, N), rand(T, N, N)
t_super = (Number, AbstractMatrix, AbstractMatrix, Number, AbstractMatrix)
t_args = (typeof(α), typeof(A), typeof(B), typeof(β), typeof(C))
label = "T=$T, β=$β, N=$N"
SUITE["NDM"]["$label - super"] = @benchmarkable(invoke(commutator!, $t_super, $α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
SUITE["NDM"]["$label - invoke"] = @benchmarkable(invoke(commutator!, $t_args, $α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
SUITE["NDM"]["$label - direct"] = @benchmarkable(commutator!($α, $A, $B, $β, Cb), setup=(Cb = copy($C)))
where `commutator!` is the function I’d actually like to benchmark and `t_super` is the signature of the more general method I’d like to compare against.
The result of this benchmark is this:
| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["NDM", "T=Complex{Float64}, β=1, N=4 - direct"] | 627.216 ns (5%) | | | |
| ["NDM", "T=Complex{Float64}, β=1, N=4 - invoke"] | 24.574 μs (5%) | | 208 bytes (1%) | 3 |
| ["NDM", "T=Complex{Float64}, β=1, N=4 - super"] | 21.289 μs (5%) | | 3.25 KiB (1%) | 17 |
“direct” is the benchmark for the `commutator!` function without any overhead, “invoke” is the benchmark for the exact same method, called through `invoke`, and “super” is the benchmark for the more general method, also called through `invoke`. The “direct” time is below the margin of error for the “invoke”/“super” times (5% of 24.574 μs is about 1.2 μs, roughly twice the 627 ns “direct” time), which means the results for “invoke”/“super” are completely useless: I’m really only measuring `invoke`.
So, my question is whether there’s any way within BenchmarkTools.jl/PkgBenchmark.jl to eliminate the overhead of `invoke`, or any workaround. I see three possibilities:
1. Something within BenchmarkTools akin to `setup` that just does the right thing (do @jrevels or @kristoffer.carlsson have any ideas?)
2. Copying the code for the more general `commutator!` method under a new name, `commutator_super!`, and calling that in the benchmark. This would obviously work, but it’s very inelegant (code duplication), and I’d like to avoid it if at all possible.
3. Automating possibility (2) via metaprogramming. Something like

       commutator_super! = @swap_signature(commutator!, t_args, t_super)

   where the `@swap_signature` macro would have to be implemented to define a new function that uses the body of the `commutator!` method for the `t_super` signature, but with the function signature `t_args`. This seems pretty difficult. The farthest I’ve gotten with this is to get the AST of a function, e.g.

       function f(x::Int)
           if (x>0)
               return x
           else
               return -x
           end
       end
       meth = methods(f, (Int, )).ms[end]
       ast = ccall(:jl_uncompress_ast, Any, (Any, Any), meth, meth.source).code

   This gives me an array of expressions. Is there any way to turn that back into a function? If so, that should actually allow me to implement the `@swap_signature` macro.
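A variant of possibility (2) that avoids copying the method body might be a thin named wrapper around `invoke`, so that the signature appears as a literal inside the wrapper instead of being passed in as a runtime value. The following is only a minimal sketch under assumptions: it assumes Julia 1.x, where `invoke` takes the signature as a `Tuple{...}` type, and `which_method` and `which_super` are hypothetical stand-ins for `commutator!` and `commutator_super!`. Whether this actually removes the overhead in the benchmark would still have to be measured.

```julia
# Hypothetical stand-in for commutator!: one general and one specialized
# method under the same function name. The return values only serve to
# show which method was selected.
which_method(::AbstractMatrix) = :general
which_method(::Matrix{Float64}) = :specialized

# Hypothetical wrapper playing the role of commutator_super!. The signature
# is written as a literal Tuple type, so it is a constant inside the wrapper
# rather than a value interpolated into the benchmark expression.
which_super(A) = invoke(which_method, Tuple{AbstractMatrix}, A)

A = ones(2, 2)
direct_result = which_method(A)   # normal dispatch picks the specialized method
super_result = which_super(A)     # invoke forces the general method
```

In the benchmark suite, the wrapper would then be called directly, in the same way as the “direct” case, instead of calling `invoke` inside `@benchmarkable`.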