Unstable execution time with high standard error in Julia

Thanks for your reply! I pay attention to the standard deviation because it is one of the criteria I use to measure the stability of an algorithm. In fact, I also record the mean, median, and standard deviation. However, when I see such a large standard deviation, I am puzzled about why some runs take so much longer. I need to figure out whether it is an issue with the algorithm or with the implementation.

Standard deviation is probably not a good measure for distributions that are extremely far from Gaussian, especially multi-modal ones like this, though it can still give a qualitative impression of "large variability".
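If you want summary numbers that are robust to that shape, you can compute quantiles from the raw sample times that BenchmarkTools records. A minimal sketch, assuming the Trial object stores per-sample times in nanoseconds in its times field, with your_function standing in for the code under test:

using BenchmarkTools, Statistics

t = @benchmark your_function()
med = median(t.times)                       # robust measure of the center, in ns
q25, q75 = quantile(t.times, [0.25, 0.75]) # robust measure of spread
println("median = $med ns, IQR = $(q75 - q25) ns")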


So you’re using wall-clock runtime as a proxy for something like "how many iteration steps did my adaptive algorithm need to go below the error threshold"?

That’s not bad for quick-and-dirty impressions and end-to-end plausibility checks, but I would recommend logging that number directly, for many reasons (e.g. it also lets you separate improvements of the algorithm from improvements of the implementation).
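For example, the solver can return the iteration count alongside the result so you can log it directly. The adaptive_solve below is a made-up stand-in, and its update rule is a placeholder Newton step, not your algorithm:

# Hypothetical adaptive iteration that reports how many steps it needed.
function adaptive_solve(x0; tol = 1e-8, maxiter = 10_000)
    x = x0
    iters = 0
    while iters < maxiter
        xnew = 0.5 * (x + 2 / x)   # placeholder update (Newton step for sqrt(2))
        iters += 1
        abs(xnew - x) < tol && return (xnew, iters)
        x = xnew
    end
    return (x, iters)
end

result, niters = adaptive_solve(1.0)
@info "converged" niters result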


In fact, the variance of the number of iterations is small, while the variance of the running time of the whole algorithm is relatively large. This confuses me, so I want to get to the bottom of it.

Depending on how much RAM you have, you can also just turn off the garbage collector during the benchmark to see whether that solves the issue:

using BenchmarkTools

GC.enable(false)            # disable garbage collection for the measurement
@benchmark your_function()
GC.enable(true)             # re-enable it afterwards
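As an alternative to switching the collector off globally, BenchmarkTools has a gcsample benchmark parameter that runs a collection before each sample, so garbage left over from earlier samples is less likely to be collected in the middle of a measurement:

using BenchmarkTools

# Run GC.gc() before every sample; collections triggered by the
# sample's own allocations are still included in the measurement.
@benchmark your_function() gcsample=true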

Thanks for your advice! The variance became smaller.
Before GC.enable(false), the result is

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  147.334 μs …   3.006 ms  ┊ GC (min … max): 0.00% … 93.81%
 Time  (median):     164.062 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   174.205 μs ± 131.898 μs  ┊ GC (mean ± σ):  5.59% ±  6.78%

                ▁▂▃▃▁▁▁▂▂▆▅▇██▇▆▅▃▃▁
  ▁▁▁▁▁▁▂▂▂▂▃▄▅███████████████████████▇▇▆▆▅▅▄▃▄▃▃▃▂▂▂▂▂▂▁▂▂▁▁▁▁▁ ▄
  147 μs           Histogram: frequency by time          186 μs <

 Memory estimate: 385.34 KiB, allocs estimate: 268.

After GC.enable(false), the result is

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  153.250 μs …  1.817 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     182.520 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   187.567 μs ± 56.816 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

          ▁▃▅▇▇█▆▆▅▄▅▄▆▅▄▂▃▂
  ▂▂▃▅▆▇█████████████████████▇▅▅▅▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂ ▄
  153 μs          Histogram: frequency by time          261 μs <

 Memory estimate: 385.34 KiB, allocs estimate: 268.

Other factors that can contribute to the variance:

  • other processes that are running; close all browser windows and VSCode when benchmarking
  • make sure you are not benchmarking on a laptop that is running on battery, otherwise power management causes performance variations
  • if you have a CPU with both slow (efficiency) cores and performance cores, try to make sure that only the performance cores are used, e.g. with ThreadPinning.jl (GitHub - carstenbauer/ThreadPinning.jl: Readily pin Julia threads to CPU-threads; Linux only). A minimal usage sketch follows this list.
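A minimal ThreadPinning.jl sketch (Linux only): pinthreads(:cores) pins the Julia threads to separate cores in order; check the package documentation for strategies that match hybrid CPUs:

using ThreadPinning

# Pin each Julia thread to its own CPU core so the OS scheduler
# cannot migrate threads between cores mid-benchmark.
pinthreads(:cores)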

I see. Thanks for your suggestion!