Yes, this seems to be it. Since invert_shares_δ! mutates δ in place, every @benchmark sample after the first starts from the already-converged δ and finishes almost immediately, so @benchmark on the mutating call is not a good way to evaluate the performance of a fixed-point algorithm like mine. The fix is a wrapper function of the following form (passing the other parameters):
```julia
function evaluate(δ_in, Xβ, params)
    δ_out = copy(δ_in)                    # copy so the in-place inversion never touches the caller's δ_in
    invert_shares_δ!(δ_out, Xβ, params)   # each benchmark evaluation therefore starts from the same initial δ
    return δ_out
end
```
This gives the “correct” benchmark performance:
```
BenchmarkTools.Trial: 151 samples with 1 evaluation.
 Range (min … max):  18.512 ms … 378.933 ms  ┊ GC (min … max):  0.00% … 90.32%
 Time  (median):     20.174 ms               ┊ GC (median):     0.00%
 Time  (mean ± σ):   34.223 ms ±  68.247 ms  ┊ GC (mean ± σ):  39.90% ± 18.29%

  █▁
  ██▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▁▄▅ ▄
  18.5 ms        Histogram: log(frequency) by time         372 ms <

 Memory estimate: 307.06 MiB, allocs estimate: 18968.
```
This is roughly half the time of running the code twice in a row (see my benchmark estimates above), so it makes sense.
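For reference, the wrapper can be benchmarked with a call of roughly this shape (sketch only; δ_init is a placeholder for whatever the starting δ is called, and interpolating the arguments with $ keeps non-const globals from inflating the timing):

```julia
using BenchmarkTools

# Sketch: δ_init, Xβ, and params are placeholder names for the actual inputs.
# $-interpolation avoids the dispatch overhead of non-const global variables.
@benchmark evaluate($δ_init, $Xβ, $params)
```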
Sad news—my code is slower than I thought! I guess I will have to make it faster.
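As an aside, BenchmarkTools can also deal with the mutation directly via a setup phase plus evals=1, so each sample gets a fresh copy and the copy itself is not included in the timing. A sketch with the same placeholder name:

```julia
using BenchmarkTools

# Sketch: the setup runs once per sample (outside the timed region), so the
# copy is not measured. evals=1 is needed because invert_shares_δ! mutates δ,
# and the setup is not rerun between evaluations within a sample.
@benchmark invert_shares_δ!(δ, $Xβ, $params) setup=(δ = copy($δ_init)) evals=1
```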