Repeatedly multiplying a Float64 is fastest in chunks of <51 terms

lwabeke · March 26, 2020, 2:16pm

I suspect the `@btime` result is misleading. It is not wrong as a micro-benchmark, but it misrepresents what you would see in general use.

I happened to read part of this post just before yours:
PSA: Microbenchmarks remember branch history

I suspect that the small code size of this very simple micro-benchmark, which calls the same function with the same inputs every time, is skewing what you're seeing.

Granted, there might be some merit to the argument based on the number of available registers, which could make chunks below a certain cut-off size faster.

I would be more inclined to believe a test where you:
create an array of 1000 acceptable inputs
randomly permute this array
create a function that runs through the array of inputs in a for loop
time (`@time`) or even `@btime` that function
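
A minimal sketch of that test, assuming a hypothetical `repeated_product` kernel as a stand-in for the function from the original post (the name, the input range 0.9 to 1.1, and the chunk size of 50 are all illustrative assumptions, not taken from the thread). It uses `@elapsed` from Base so it runs without extra packages; `@btime` from BenchmarkTools could be swapped in for the timing line.

```julia
using Random

# Hypothetical stand-in for the function under test:
# multiplies 1.0 by x a total of n times.
function repeated_product(x::Float64, n::Int)
    acc = 1.0
    for _ in 1:n
        acc *= x
    end
    return acc
end

# Runs the kernel once per input, so every call sees a different argument.
function run_all(inputs::Vector{Float64}, n::Int)
    total = 0.0
    for x in inputs
        total += repeated_product(x, n)
    end
    return total  # returned so the work cannot be optimized away
end

rng = MersenneTwister(1234)  # fixed seed, only to make the run repeatable

# 1000 inputs (assumed "acceptable" range), randomly permuted.
inputs = shuffle!(rng, collect(range(0.9, stop = 1.1, length = 1000)))

run_all(inputs, 50)                      # warm-up call, excludes compilation time
elapsed = @elapsed run_all(inputs, 50)   # time one pass over all 1000 inputs
println("1000 calls took ", elapsed, " s")
```

Dividing `elapsed` by 1000 gives a rough per-call figure that can be compared across chunk sizes by changing the second argument to `run_all`.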

Hopefully that provides enough variability to prevent unrealistic tricks and gives a more realistic picture of what you would see in practice when every call uses different inputs.
