Draw a random number through a probability distribution defined as an array

This will be slow for a large array — the expected number of times it calls rand() is proportional to the length of the array.

Note that this also takes time proportional to length(prob_dens) (though is faster because it only generates a single random number and scans the array consecutively), because the algorithm works by scanning a cumsum through the weights array to do inverse transform sampling.

If you need to generate many samples for a fixed weights table, it is probably faster to use AliasTables.jl, which allows you to precompute an efficient mapping from random numbers to indices via the alias method:

using AliasTables
at = AliasTable(vec(prob_dens))
CartesianIndices(prob_dens)[rand(at)]

e.g. this is almost 200x faster for me than sample for prob_dens = rand(100,200) (not including the one-time cost to generate the AliasTable).

6 Likes