Say I have 3 numbers - x,y,z, that must sum to 1. Is there a way to generate a random combination of these 3 numbers that respects this condition?
This problem is trivial if it’s 2-dimensional since you can individually generate x using the rand() function and y would consequently be 1-x, but that method wouldn’t work for 3 dimensions and above.
Ah very true, transforming random numbers can be very tricky. I generated 100,000 random triplets for each method and plotted their distributions:
Taking a uniform set of numbers and dividing by their sum appears to bias the numbers toward the mean of 1/3. The diff() method suggested by @Dan also produces a different distribution. Very interesting.
A better visualization would be to plot the bivariate density of (x, y) (omitting z) as a surface: for a uniform distribution on the simplex, it should be flat over the triangle x ≥ 0, y ≥ 0, x + y ≤ 1.
After thinking about it, the univariate densities are not enough to draw conclusions about the distribution of the triplet, since they tell you nothing about the dependence structure. So you NEED to check bivariate uniformity. One bivariate density is enough since there are only two degrees of freedom.
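A crude version of that bivariate check can be done without plotting. Joining the midpoints of the triangle's edges cuts it into four equal-area pieces (x > 1/2, y > 1/2, z > 1/2, and the central piece), so a uniform sampler should put about a quarter of its points in each. The sketch below assumes that the diff() method means taking the gaps between 0, two sorted uniforms, and 1; the function names are made up for illustration.

```python
import random

def normalized_uniforms(rng):
    """Three uniforms divided by their sum."""
    u = [rng.random() for _ in range(3)]
    s = sum(u)
    return tuple(v / s for v in u)

def sorted_spacings(rng):
    """Gaps between 0, two sorted uniforms, and 1 (assumed reading of diff())."""
    a, b = sorted((rng.random(), rng.random()))
    return (a, b - a, 1.0 - b)

def quarter_fractions(sampler, n=100_000, seed=0):
    """Fraction of samples in each of the four equal-area pieces of the triangle."""
    rng = random.Random(seed)
    counts = [0, 0, 0, 0]  # x > 1/2, y > 1/2, z > 1/2, central piece
    for _ in range(n):
        x, y, z = sampler(rng)
        if x > 0.5:
            counts[0] += 1
        elif y > 0.5:
            counts[1] += 1
        elif z > 0.5:
            counts[2] += 1
        else:
            counts[3] += 1
    return [c / n for c in counts]

if __name__ == "__main__":
    print("normalized uniforms:", quarter_fractions(normalized_uniforms))
    print("sorted spacings:    ", quarter_fractions(sorted_spacings))
```

For normalized uniforms, x > 1/2 requires the first uniform to exceed the sum of the other two, which happens with probability 1/6, so the central piece should receive about half of the points instead of a quarter. This four-cell count is only a coarse test; a finer grid or a surface plot would be needed to look at the density in detail.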
The 3 variables cannot be independent, as 2 determine the third. Additionally, they are negatively correlated (as a big value for x forces small values for y and z). The Dirichlet distribution with all parameters equal to 1 is uniform on the simplex, but its marginals are not uniform (as the method using log shows).
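To put numbers on both points, here are the standard Dirichlet moment formulas specialized to parameters (1, 1, 1), which is the uniform case on the simplex. The marginal density f_x below is the Beta(1, 2) density, so small values of each coordinate are more likely than large ones.

```latex
(x, y, z) \sim \mathrm{Dirichlet}(1, 1, 1):
\qquad f_x(t) = 2(1 - t), \quad 0 \le t \le 1,
\qquad \mathbb{E}[x] = \tfrac{1}{3}, \quad \operatorname{Var}(x) = \tfrac{1}{18},
\qquad \operatorname{Cov}(x, y) = -\tfrac{1}{36}, \quad \operatorname{Corr}(x, y) = -\tfrac{1}{2}.
```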
The Dirichlet with other parameters can allow concentrating the density on the corners or the center of the simplex (and even making biased distributions on the simplex).
So, I guess, the Dirichlet (or log method) is the ‘simplest’ choice (and thus preferable in some Occam’s razor sense).
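For reference, a minimal sketch of the log method: draw independent Exp(1) variates as minus the log of a uniform and divide by their sum, which gives a Dirichlet with all parameters equal to 1. The function name and the `rng` argument are illustrative choices, not something from the thread.

```python
import math
import random

def simplex_via_exponentials(n=3, rng=random):
    """Return n nonnegative numbers summing to 1, uniform on the simplex."""
    # -log(1 - U) is Exp(1) when U is uniform on [0, 1);
    # using 1 - U keeps the argument of log strictly positive.
    e = [-math.log(1.0 - rng.random()) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]

if __name__ == "__main__":
    x, y, z = simplex_via_exponentials()
    print(x, y, z, x + y + z)
```

The same function works for any n, which is one reason this construction is convenient beyond the 3-number case.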
If you are psychologically satisfied with this pattern, you could “naturally” extend it in the following manner.
In the x,y case you do x=rand() and y=1-x.
In the x,y,z case you could do x=rand(), y=rand()*(1-x) and z=1-x-y.
This can be extended to any size…
to implement it you could use the accumulate function…
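A minimal sketch of this sequential construction for any size, written as a plain loop rather than with an accumulate function: each step takes a random fraction of whatever is left, and the last number takes the remainder so the total is exactly 1 up to rounding. The name `sequential_split` is made up for illustration.

```python
import random

def sequential_split(n, rng=random):
    """Split 1 into n nonnegative parts by repeatedly taking a random
    fraction of the remainder; the last part is what is left over."""
    parts = []
    remaining = 1.0
    for _ in range(n - 1):
        piece = rng.random() * remaining
        parts.append(piece)
        remaining -= piece
    parts.append(remaining)
    return parts

if __name__ == "__main__":
    parts = sequential_split(3)
    print(parts, sum(parts))
```

Note that the coordinates are not treated symmetrically here: the first number is uniform on [0, 1] and averages 1/2, while the later ones are smaller on average, so this respects the sum constraint but is not uniform on the simplex.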
The geometric representation of the unit simplex is the triangle ABC in 3D space, with vertices A(1,0,0), B(0,1,0), C(0,0,1).
Here is the result of uniform sampling from this simplex using Exp(1) variates, as suggested by lrnv (the method is thoroughly justified in Devroye’s book, page 207):
Similar sampling from Dirichlet, proposed by RobertGregg.
But normalizing a vector of 3 uniform variates gives a result that is far from uniform: