Why does BenchmarkTools `@belapsed` make so many allocations?

For a long time, I have been hand-coding my own benchmarks using expressions like `minimum(@elapsed f() for _ in 1:samples)`. It was brought to my attention that BenchmarkTools.jl does this kind of repeated sampling automatically, so I have been trying to switch over. However, BenchmarkTools.jl seems to incur much higher memory usage, which limits the size of the benchmark studies I can run on my computer.

Why does the BenchmarkTools @belapsed macro cause so many allocations in the example below? Is there a way to prevent this?

julia> using BenchmarkTools

julia> BenchmarkTools.DEFAULT_PARAMETERS.samples = 1

(Here I have set the samples to 1 so that the BenchmarkTools macro is, in effect, equivalent to the plain @elapsed macro.)

julia> x = 5.0

julia> @time @elapsed sin(x)
  0.000004 seconds (1 allocation: 16 bytes)

julia> @time @belapsed sin(x)
  0.579649 seconds (549.22 k allocations: 10.147 MiB, 91.95% gc time, 5.33% compilation time)

julia> @time @belapsed sin($x)
  0.552421 seconds (44.52 k allocations: 2.397 MiB, 94.37% gc time, 4.06% compilation time)

Second run, so compilation latency is excluded:

julia> @time @elapsed sin(x)
  0.000005 seconds (1 allocation: 16 bytes)

julia> @time @belapsed sin(x)
  0.558569 seconds (545.59 k allocations: 9.939 MiB, 92.03% gc time, 4.72% compilation time)

julia> @time @belapsed sin($x)
  0.551201 seconds (44.52 k allocations: 2.397 MiB, 93.56% gc time, 4.69% compilation time)

Similar but with five samples:

julia> BenchmarkTools.DEFAULT_PARAMETERS.samples = 5

julia> @time minimum(@elapsed sin(x) for _ in 1:5)
  0.032541 seconds (72.55 k allocations: 3.968 MiB, 99.26% compilation time)

julia> @time @belapsed sin(x)
  0.540074 seconds (549.64 k allocations: 10.001 MiB, 92.67% gc time, 4.19% compilation time)

julia> @time @belapsed sin($x)
  0.544774 seconds (44.58 k allocations: 2.399 MiB, 93.15% gc time, 4.74% compilation time)

BenchmarkTools.jl also leaks memory.
LoopVectorization.jl’s benchmarks would leak about 20 GB of memory by the time they were done.
So my workaround was to use Distributed, run the benchmarks in worker processes, and then periodically call rmprocs(workers()) to free the memory and addprocs to replace the workers.
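A minimal sketch of that workaround, assuming your benchmark driver can be passed in as a function (the name `benchmark_in_fresh_worker` is my own, not part of any package):

```julia
using Distributed

# Run `f(batch)` on a freshly spawned worker process, then kill the
# worker so any memory retained by the benchmarking machinery is
# released along with the process.
function benchmark_in_fresh_worker(f, batch)
    pid = only(addprocs(1))
    try
        # Closures and Base functions serialize to the worker; a named
        # function from your own code would need `@everywhere` first.
        return remotecall_fetch(f, pid, batch)
    finally
        rmprocs(pid)  # frees all memory held by the worker process
    end
end
```

Spawning and tearing down a worker per batch adds startup overhead, so this only pays off when the leak is large relative to the batch runtime.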

Of course, you could argue that this makes it less convenient than rolling your own benchmark loop with repeated @elapsed.


That’s my feeling too XD. BenchmarkTools.jl seems really useful if you want a quick A/B comparison of two functions, but to “benchmark” a whole package, where you want to compute specific statistics over the timings, vary the input sizes, and organize everything in a DataFrame or table, hand-coding seems like the way to go.
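For what it’s worth, a hand-rolled study along those lines can stay quite small. A sketch, using only the Statistics stdlib (the function name and the NamedTuple row layout are my own choices; `DataFrame(results)` would turn the rows into a table if you use DataFrames.jl):

```julia
using Statistics

# Time `f(x)` `samples` times and return summary statistics.
function sample_times(f, x; samples = 100)
    f(x)  # warm-up call so compilation time is not measured
    times = [(@elapsed f(x)) for _ in 1:samples]
    (minimum = minimum(times), median = median(times), mean = mean(times))
end

# Example: sweep input sizes, collecting one row of statistics per size.
results = [(n = n, sample_times(sum, rand(n))...) for n in (10, 100, 1000)]
```

Each element of `results` is a NamedTuple like `(n = 10, minimum = ..., median = ..., mean = ...)`, which most table packages accept directly.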