# Variational inference of mixture models?

I’m just cutting my teeth on Turing. (BTW, thanks for this amazing software.) Does Turing support VI of mixture models? I attempted ADVI against a Gaussian mixture model, but without success, as shown below. The error occurred at `Z[ii] ~ Categorical(wtrue)`. Is it possible? Thanks!

```julia
using Distributions
using Turing, MCMCChains

# set up the true (data-generating) mixture distribution
ncomptrue = 3
mutrue = [1, 2, 3]            # ground-truth component means
muprior = Normal(0, 3)
sigma = 0.1                   # a-priori known
nsamples = 100
# components have equal probability
wtrue = [1, 1, 1] / ncomptrue
mixtrue = MixtureModel(Normal, [(mm, sigma) for mm in mutrue], wtrue)
Y = rand(mixtrue, nsamples)

# fit distribution with known number of components
@model Mixmodel(x) = begin
    N = length(x)
    mu1 ~ muprior
    mu2 ~ muprior
    mu3 ~ muprior
    mu = [mu1, mu2, mu3]
    Z = Vector{Int}(undef, N)
    for ii in 1:N
        Z[ii] ~ Categorical(wtrue)
        x[ii] ~ Normal(mu[Z[ii]], sigma)
    end
end

model = Mixmodel(Y)
```
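For reference, the ADVI call that triggers the error would look roughly like this (a sketch with illustrative hyper-parameters, using Turing's `ADVI`/`vi` interface):

```julia
# Attempt mean-field ADVI on the model above; this fails at
# `Z[ii] ~ Categorical(wtrue)` because Z is a discrete latent variable
advi = ADVI(10, 1000)    # 10 MC samples per gradient step, 1000 iterations
q = vi(model, advi)
```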

Hi, you can use variational inference only for continuous parameters. In variational inference, the goal is to minimise the KL divergence from the variational distribution to the true posterior. In Turing this is done with gradient-based optimisation, which requires computing gradients — and gradients are not available for discrete parameters such as `Z`.
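Concretely, the objective being optimised is the standard VI objective (not specific to Turing):

$$
q^{*} \;=\; \arg\min_{q \in \mathcal{Q}} \, \mathrm{KL}\!\left(q(\theta) \,\|\, p(\theta \mid x)\right)
\;=\; \arg\max_{q \in \mathcal{Q}} \, \mathbb{E}_{q(\theta)}\!\left[\log p(x, \theta) - \log q(\theta)\right],
$$

and the right-hand expectation (the ELBO) is maximised by stochastic gradient ascent over the parameters of $q$.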

However, you can reformulate a mixture model so that it contains only continuous parameters, by marginalising out the discrete component assignments `Z`.

The following model runs using ADVI.

```julia
@model function gmm(x, K)
    N = length(x)
    μ ~ filldist(Normal(0, 1), K)
    w ~ Dirichlet(K, 1)
    for i in 1:N
        x[i] ~ MixtureModel(Normal, μ, w)
    end
end
```
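A minimal sketch of fitting it (assuming the data `Y` generated above; the ADVI hyper-parameters are illustrative):

```julia
model = gmm(Y, 3)

# Mean-field ADVI: 10 MC samples per gradient step, 1000 optimisation steps
q = vi(model, ADVI(10, 1000))

# Draw samples from the fitted variational posterior over (μ, w)
samples = rand(q, 1_000)
```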

Alternatively, you could use the `arraydist` function, i.e.

```julia
@model function gmm(x, K)
    N = length(x)
    μ ~ filldist(Normal(0, 1), K)
    w ~ Dirichlet(K, 1)
    x ~ arraydist(map(i -> MixtureModel(Normal, μ, w), 1:N))
end
```

Note that you might want to change the AD backend if your model has a lot of parameters. In this case it is probably not necessary, but for more involved models keep this in mind and possibly switch to a reverse-mode AD backend. You can find details about this in the Turing documentation.
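For example, with the `setadbackend` interface (assuming ReverseDiff as the reverse-mode backend; depending on your Turing version the package may need to be loaded explicitly):

```julia
using ReverseDiff   # reverse-mode AD backend
using Turing

# Switch from the default forward-mode AD to reverse mode,
# which scales better when the number of parameters is large
Turing.setadbackend(:reversediff)
```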

`filldist(MixtureModel(Normal, μ, w), N)` is better in this case since the distribution is the same for all elements.
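That would make the likelihood statement, as a sketch:

```julia
@model function gmm(x, K)
    N = length(x)
    μ ~ filldist(Normal(0, 1), K)
    w ~ Dirichlet(K, 1)
    # N iid draws from the same mixture; avoids constructing N separate
    # MixtureModel objects as arraydist does
    x ~ filldist(MixtureModel(Normal, μ, w), N)
end
```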


Yes of course! Thank you very much!

One follow-on question: could you explain why `K` is required in the function signature? Consider the two nearly identical models below (the only difference, as far as I can tell, being whether `K` is passed as an argument with a default value or defined inside the model body).

```julia
@model gmm(x, K=3) = begin
    N = length(x)
    μ ~ filldist(Normal(0, 1), K)
    w ~ Dirichlet(K, 1)
    for ii in 1:N
        x[ii] ~ MixtureModel(Normal, μ, w)
    end
end

@model gmm2(x) = begin
    K = 3
    N = length(x)
    μ = filldist(Normal(0, 1), K)
    w ~ Dirichlet(K, 1)
    for ii in 1:N
        x[ii] ~ MixtureModel(Normal, μ, w)
    end
end
```

The following code executes successfully:

```julia
model = gmm(Y)
```

Changing the first line above to `model = gmm2(Y)` gives an error. How can this be when `gmm(Y)` and `gmm2(Y)` appear to be functionally equivalent?

This should be `μ ~ filldist(Normal(0,1), K)`.

Yep; that was the problem…