How to sample many ordered IID variables in Turing

Hi there,

I’d like to sample many ordered IID variables, but I find that when the number of variables is large the efficiency gets very low. The code below works when the number of variables is around 30, but not when it exceeds 40.

I specify that all of these variables are X ~ Uniform(0, 1) IID and that X[1] < X[2] < … < X[n]. I know this can be done with a Dirichlet distribution, but I want an ordered Uniform because it lets me easily check the results against the analytical calculation. I also know this can be done by drawing n samples and then sorting them, but I want to see how Turing.@addlogprob! -Inf can be used to handle it.

Is there a good solution to it?

    using Turing, Distributions

    @model function demo()
        X ~ filldist(Uniform(0, 1), n)

        # Reject any draw that is not strictly increasing.
        for i in 1:(length(X) - 1)
            if !(X[i] < X[i+1])
                Turing.@addlogprob! -Inf
                return
            end
        end

        return X
    end

    n = 50  # n = 30 works
    n_samples = 10_000
    r = sort(rand(n)) * 0.8  # ordered initial point, away from the upper boundary

    @time chain = sample(demo(), NUTS(), n_samples;
                         discard_initial=n_samples ÷ 2, thinning=20,
                         progress=true, initial_params=r)

Your X parameter should be a random draw from the joint distribution of all n order statistics of a U(0, 1) sample of length n. You really don’t want to sample this with @addlogprob!: there are n! ways to permute your individual draws, and only one of them is ordered.
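To put a number on it: under the unconstrained filldist prior, the chance that a draw happens to come out ordered is 1/n!, which is astronomically small long before n = 30. Something like:

    # Prior probability that n IID draws happen to be sorted is 1/n!.
    prob_ordered(n) = 1 / factorial(big(n))

    prob_ordered(30)  # ≈ 3.8e-33
    prob_ordered(50)  # ≈ 3.3e-65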

The easy way to handle this is

    using Bijectors  # provides `ordered`

    @model function foo(n)
        X ~ Bijectors.ordered(filldist(Uniform(), n))
    end
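For small n it samples fine, e.g. something like:

    chn = sample(foo(5), NUTS(), 1_000)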

For me this model hangs once n>12 for some unknown reason.

In your specific case, there’s another trick that will probably be even faster. Namely, let Z_i \sim \mathrm{Exponential}(1) for i \in 1, \ldots, n+1. If X_i = \frac{\sum_{j=1}^i Z_j}{\sum_{j=1}^{n+1} Z_j}, then [X_1, \ldots, X_n] follows the distribution you want. Distributions.JointOrderStatistics uses this method for random sampling. You could use this in a model like:

    @model function bar(n)
        Z ~ filldist(Exponential(), n + 1)  # n + 1 IID exponential spacings
        Zsum = cumsum(Z)                    # partial sums Z_1 + ... + Z_i
        X := view(Zsum, 1:n) ./ Zsum[n+1]   # normalize; X is ordered by construction
    end
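You can sanity-check the construction against JointOrderStatistics outside of Turing, e.g. something like:

    using Distributions

    n = 5
    d = JointOrderStatistics(Uniform(), n)  # joint law of all n order statistics
    Z = rand(Exponential(), n + 1)
    X = cumsum(Z)[1:n] ./ sum(Z)            # one draw via normalized partial sums
    issorted(X) && isfinite(logpdf(d, X))   # true: X is increasing and in the support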

Here’s a demo for n=100:

julia> chns = @time sample(bar(100), NUTS(0.8), MCMCThreads(), 1_000, 4);
┌ Info: Found initial step size
└   ϵ = 1.6
┌ Info: Found initial step size
└   ϵ = 1.6
┌ Info: Found initial step size
└   ϵ = 1.6
┌ Info: Found initial step size
└   ϵ = 0.8
Sampling (4 threads) 100%|█████████████████████████████████████████████████████████████| Time: 0:00:10
 10.578692 seconds (62.78 M allocations: 69.804 GiB, 40.24% gc time)

julia> x_mean = vec(mean(sort(rand(10_000, 100); dims=2); dims=1));

julia> x_mean_turing = vec(mean(stack(MCMCChains.get_params(chns; flatten=false).X); dims=(1, 2)));

julia> maximum(abs, x_mean - x_mean_turing)

julia> isapprox(x_mean, x_mean_turing; rtol=0.01)
true
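Since you wanted to check against analytic results: the i-th order statistic of n IID U(0, 1) draws follows Beta(i, n + 1 - i), so its exact mean is i/(n + 1). You can compare the chain against that directly, e.g.:

    # Exact means of the U(0, 1) order statistics: E[X_(i)] = i / (n + 1).
    n = 100
    x_mean_exact = (1:n) ./ (n + 1)
    isapprox(x_mean_exact, x_mean_turing; rtol=0.01)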

Maybe this is what you meant by your original reference to the Dirichlet distribution, but another way you could do this is:

    @model function baz(n)
        Z ~ Dirichlet(ones(n + 1))  # flat Dirichlet: normalized exponential spacings
        X := cumsum(view(Z, 1:n))   # partial sums of the first n weights are ordered
    end

This works because one can sample from Dirichlet(ones(n + 1)) by drawing n + 1 IID standard exponentials and normalizing by their sum (their L1 norm; see the Wikipedia article on the Dirichlet distribution). The first n entries of the resulting cumulative sum are then exactly what we computed above in the bar model. Here, however, a different Bijector will be used to unconstrain the distribution, so the sampling efficiency may be better or worse.
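If you want to convince yourself of the equivalence, a quick Monte Carlo check looks something like:

    using Distributions, Statistics

    n, m = 10, 100_000
    Z = rand(Dirichlet(ones(n + 1)), m)  # (n + 1) × m matrix, one draw per column
    X_dir = cumsum(Z[1:n, :]; dims=1)    # partial sums of the first n weights
    X_sorted = sort(rand(n, m); dims=1)  # m samples of sorted IID uniforms
    maximum(abs, mean(X_dir; dims=2) - mean(X_sorted; dims=2))  # small (Monte Carlo error)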