Fast sampling from discrete distributions


#1

I’m writing some population simulations that make heavy use of categorical/multinomial sampling. I can see from the code that the categorical and multinomial samplers use the alias table method.

In this multinomial, the alias method is only used when n^2 < k where n is the number of trials and k is the number of possible events.

At some point I think that @johnmyleswhite added this cutoff; I was wondering what the rationale for it was?

Also, I wonder if there are some faster methods available. For example, https://arxiv.org/pdf/1611.00532.pdf
present an algorithm (#5 in Table 1) that appears to scale better with k than the alias table method (Walker’s algorithm). I don’t know the literature well, so I thought I’d mention this in case its new to folks but am generally interested in what are the fastest methods (big surprise :slight_smile: ).