Hey!
I can know the numbers beforehand (just not their frequency). Ideally I would find a solution that can handle up to the max of UInt64
. For a test set, I have the distribution of numbers that looks evenly sampled from the range:
The counts for the numbers are heavily skewed, in fact, 75% of the numbers are <=100
. This is because these numbers originate from DNA data and hence are not random:
(The x-scale is log10)
I’m not sure if I get your second proposal. Could you elaborate as it sounds nice
if you were to sort your list before counting
…as you can start each thread from a different part in the array
I do not really have a list beforehand as I have to read the numbers from a file (or actually parse them as it is text format with some mess in between). I could pass bytes to different threads, parse the numbers, and then maybe do what you propose?