The reasoning behind our definition of “sample” may not be obvious to all readers. If the time to execute a benchmark is smaller than the resolution of your timing method, then a single evaluation of the benchmark will generally not produce a valid sample. In that case, one must approximate a valid sample by recording the total time t it takes to perform n evaluations, and estimating the sample’s time per evaluation as t/n. For example, if a sample takes 1 second for 1 million evaluations, the approximate time per evaluation for that sample is 1 microsecond. It’s not obvious what the right number of evaluations per sample should be for any given benchmark, so BenchmarkTools provides a mechanism (the tune! method) to automatically figure it out for you.
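As a rough sketch of that t/n estimate, here is what it looks like in Base Julia alone, with no packages (`work()` and the evaluation count are made-up placeholders, not BenchmarkTools internals):

```julia
# Minimal sketch of the t/n per-evaluation estimate described above.
# `work` and `n` are illustrative choices, not BenchmarkTools internals.
work() = sum(abs2, 1:100)   # a benchmark far too fast to time in one evaluation

n = 100_000                 # evaluations per sample
t0 = time_ns()
for _ in 1:n
    work()
end
t = time_ns() - t0          # total time for n evaluations, in nanoseconds

time_per_eval = t / n       # this sample's estimated time per evaluation
println("≈ $(time_per_eval) ns per evaluation")
```

This is exactly the kind of bookkeeping tune! automates: picking an n large enough that t comfortably exceeds the timer's resolution.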
So it seems that in your case, you should leave evals at the default, 1, and set samples to 7.
I’m increasing the seconds to 300 and yes, there is a lot of room for optimization since I’m a Julia rookie. Even with that room, it’s actually blowing the Python implementation away (it takes less than half the time it does in Python).
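For anyone following along, those parameters can be passed per call or set globally; a sketch, where `mycalc` is just a stand-in for the real computation (this assumes BenchmarkTools is installed):

```julia
using BenchmarkTools   # assumes the package is installed

mycalc() = sum(sqrt, 1:10^6)   # placeholder for the actual computation

# Per-run parameters: keywords after the expression override the defaults.
# With samples=7 and evals=1, the run stops after 7 samples even though
# the time budget is 300 seconds.
b = @benchmark mycalc() seconds=300 samples=7 evals=1
display(b)

# Or raise the time budget for every benchmark in this session:
BenchmarkTools.DEFAULT_PARAMETERS.seconds = 300
```

The `seconds` budget is an upper bound, not a target, so generous values are cheap when `samples` caps the run anyway.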
I will report what the new benchmark results show.
Unlike @time, BenchmarkTools computes statistics (median, mean time, etc.) and has a nice representation of the result. And it’s easier to use one tool than to switch from one to another.
You can use BenchmarkTools if you prefer, but for long-running calculations where you have already done a run to compile things, there are diminishing returns to it.
You will get a statistical sample, but it is usually a small one, so the benefits are frequently marginal.
That’s great news
I still think you can squeeze some extra performance out of Julia by working on the allocations, even though I have no idea what code you are actually benchmarking.
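If allocations do turn out to matter, Base’s @allocated macro is a quick way to spot them, no packages needed. The two functions below are made-up examples, not your code:

```julia
# Two made-up versions of the same computation: one allocates a fresh
# array on every call, the other writes into a preallocated buffer.
squares(n) = [i^2 for i in 1:n]                            # allocates
squares!(out, n) = (for i in 1:n; out[i] = i^2; end; out)  # reuses `out`

buf = Vector{Int}(undef, 1000)
squares(1000); squares!(buf, 1000)          # warm up so compilation isn't counted

println(@allocated squares(1000))           # nonzero: a new array each call
println(@allocated squares!(buf, 1000))     # much smaller, often zero
```

The @time macro also prints an allocation count, and `@benchmark` reports `memory` and `allocs` per sample, so you can watch those numbers shrink as you optimize.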
I’m cleaning up the code. I will make a post soon asking for optimization tips, since I have never really dealt with low-level languages, so I have no clue about allocations (but I am eager to learn now).
Great! When you make such a post, make sure to include self-contained code, so that anyone can just copy and paste it and get a timing result for themselves to improve upon. That’s a great way to ensure that you get a lot of people hooked on making the code faster.