Benchmark macro with BenchmarkTools

I’m benchmarking my package DiscreteChoiceModels.jl and comparing against other packages in R and Python. A significant entry point to the package is a macro which implements a domain-specific language for specifying models. Ideally, when benchmarking, I’d like to include the time to evaluate and compile the macro, since in real-world use every model estimation will require evaluating the macro, but I don’t want to include compilation time for the functions used in model estimation. So I have two questions:

  1. If I use a macro inside an expression in @benchmarkable, is it evaluated for each sample, or is it evaluated outside of the benchmarked code?
  2. If it is evaluated outside the code, is there a way to force evaluation inside the benchmark?
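On the first question, a toy counting macro (my own illustration, not from the package) can show when expansion happens: the macro is expanded once, when the benchmark is defined, not once per sample.

```julia
using BenchmarkTools

const EXPANSIONS = Ref(0)

# toy macro that counts how many times it gets expanded
macro counted(expr)
    EXPANSIONS[] += 1
    return esc(expr)
end

b = @benchmarkable @counted 1 + 1
run(b)
println(EXPANSIONS[])  # 1: expanded once at definition, not per sample
```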

I think you should benchmark the macro expansion itself. Something like this:

# a toy macro that takes time doing nothing useful
macro long_to_expand(expr)
    v = [x^3 for x in rand(10000)]
    return esc(expr)
end

julia> @macroexpand @long_to_expand 1 + 1
:(1 + 1)

# benchmarking a function using the macro yields nothing useful
# in this case
julia> f(x) = @long_to_expand x*2
f (generic function with 1 method)

julia> @btime f(42)
  0.015 ns (0 allocations: 0 bytes)

# but benchmarking the macro expansion itself should reflect the time spent
# in the expansion process
julia> g() = @macroexpand @long_to_expand x*2
g (generic function with 1 method)

julia> @btime g()
 17.384 μs (15 allocations: 156.89 KiB)
:(x * 2)
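As a variant (my sketch, not strictly necessary): the wrapper function can be avoided by calling the runtime macroexpand function directly on a quoted macro call, interpolating the expression as a QuoteNode so @btime doesn’t try to evaluate it.

```julia
using BenchmarkTools

# same toy macro as above, repeated so this snippet is self-contained
macro long_to_expand(expr)
    v = [x^3 for x in rand(10000)]
    return esc(expr)
end

# benchmark the runtime expansion of a quoted macro call
@btime macroexpand(@__MODULE__, $(QuoteNode(:(@long_to_expand x * 2))))
```

This returns `:(x * 2)` and times only the expansion step.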

I’d think its primary cost arises during first execution (i.e. compile time). Anything like @btime shouldn’t report it, in my understanding. So what is your problem with compile time here?

The macro defines a domain-specific language for specifying a model, so it needs to be expanded and compiled for every model. The expansion and compilation are pretty quick - not something that’s going to be a problem in everyday use - but I want to give a fair comparison against packages in other languages that don’t have a compilation hit for every model.
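To capture that full per-model cost (expansion, lowering, compilation, and first run together), one approach is to eval a fresh definition each time so nothing is cached. A hedged sketch with the toy macro, where `first_use_time` and the gensym’d model name are my own hypothetical names:

```julia
# toy macro standing in for the real DSL macro
macro long_to_expand(expr)
    v = [x^3 for x in rand(10000)]
    return esc(expr)
end

function first_use_time()
    @elapsed begin
        # fresh gensym'd name each call so compilation is never cached
        f = eval(:( $(gensym(:model))(x) = @long_to_expand x * 2 ))
        Base.invokelatest(f, 42)  # invokelatest avoids world-age errors
    end
end

times = [first_use_time() for _ in 1:5]
```

Each element of `times` then includes the expansion and compilation hit, which is the quantity being compared against the R and Python packages.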
