Hi all,

I am trying to fit some diffusion data which we expect to be generated from two or more Rayleigh distributions.

To make some data for the MWE, take the following example:

```
function get_data(; μ₁ = 1, μ₂ = 2, N = 1000, fraction = 0.5)
dist1 = Rayleigh(μ₁)
dist2 = Rayleigh(μ₂)
N1 = round(Int, fraction * N)
N2 = round(Int, (1 - fraction) * N)
data = vcat(rand(dist1, N1), rand(dist2, N2))
return data
end
data = get_data(; fraction = 0.7);
```

As such, this is a finite mixture model of different Rayleigh distributions. At first, I experienced problems with label switching and thus I wanted to apply the `ordered`

method used in Stan. The closest I could find in Turing was this thread which correctly allows me to enforce ordering, however, at the cost of some more complex code (having to use `generated_quantities`

).

My function thus became:

```
@model function diffusion_ordered(Δr)
N = length(Δr)
ΔD ~ filldist(Exponential(), 2)
D = cumsum(ΔD)
θ ~ Uniform(0, 1)
w = [θ, 1 - θ]
dists = [Rayleigh(d) for d in D]
Δr ~ filldist(MixtureModel(dists, w), N)
return (; D)
end
```

This works well, and makes 1000 samples using 4 threads in 2-3 seconds (not counting compilation). The `rhat`

s also looks fine:

```
Summary Statistics
parameters mean std naive_se mcse ess rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
ΔD[1] 0.9419 0.0521 0.0008 0.0014 1451.6843 1.0020 41.0069
ΔD[2] 0.8301 0.0744 0.0012 0.0019 1696.2559 1.0016 47.9155
θ 0.6139 0.0726 0.0011 0.0021 1380.5319 1.0016 38.9970
```

However, this doesn’t scale well to more mixtures (e.g. `K=3`

mixtures), due to the `θ`

and `w`

definitions . Therefore, I tried a function with `w ~ Dirichlet(K, 1)`

instead:

```
@model function diffusion_ordered_dirichlet(Δr, K = 2)
N = length(Δr)
ΔD ~ filldist(Exponential(), K)
D = cumsum(ΔD)
w ~ Dirichlet(K, 1)
dists = [Rayleigh(d) for d in D]
Δr ~ filldist(MixtureModel(dists, w), N)
return (; D)
end
```

This generates equally well mixed chains:

```
Summary Statistics
parameters mean std naive_se mcse ess rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
ΔD[1] 0.9431 0.0518 0.0008 0.0010 2706.9331 1.0005 8.6276
ΔD[2] 0.8292 0.0724 0.0011 0.0013 3277.2988 1.0008 10.4455
w[1] 0.6152 0.0726 0.0011 0.0015 2516.3084 1.0008 8.0201
w[2] 0.3848 0.0726 0.0011 0.0015 2516.3084 1.0008 8.0201
```

but is much, much slower. This time it takes the model 80 seconds to fit.

I also tried another way:

```
@model function diffusion_ordered_Δw(Δr, K = 2)
N = length(Δr)
ΔD ~ filldist(Exponential(), K)
D = cumsum(ΔD)
Δw ~ filldist(Exponential(), K)
w = Δw / sum(Δw)
dists = [Rayleigh(d) for d in D]
Δr ~ filldist(MixtureModel(dists, w), N)
return (; D, w)
end
```

which is a lot faster (6 seconds) and also scales to more mixtures. It does, however, also depend on some more coding afterwards to extract the `w`

's using `generated_quantities`

).

My two main questions are:

- Is this really the way to force ordering as a constraint? Which also requires one to extract the relevant parameters afterwards with
`generated_quantities`

. - Why is
`Dirichlet`

so slow? Is know there was a type instability a few years back, but that should be long fixed by now.

Thanks a lot in advance!