Why isn't mixture model sampling in Turing?


I’m new to Julia but very excited by the possibilities. I’m particularly interested in using Turing to fit mixture models. However, in building up some non-trivial mixture examples, I’ve encountered an issue I just can’t figure. I wrote up the same model in R and Julia itself and it works fine in both cases so this is something about my understanding about how Turing works.

I have stripped down version of a binomial mixture model with multiple observations (i.e. each sample contains Q distinct observations, each its own binomial). To further reduce the problem, I feed the model the correct parameters except for the indicator variables. However, I cannot get Turing to sample the indicator parameters correctly. What is going wrong?! Any help would be greatly appreciated.

Here is the code for the model itself.

@model function binomial_replicate_mixture_model_p(t_counts,W,K,p,lambda)
    N,Q = size(t_counts)

    # Priors for the mixture weights
    #lambda ~ Dirichlet(K, 1.0)
    # Priors for the multinomial parameters

    z ~ filldist(Categorical(lambda),N)
    for n in 1:N
        for q in 1:Q          
           t_counts[n,q,:] ~ Multinomial(W[n,q],vec([1-p[z[n],q],p[z[n],q]]))
Simulation and control script

Here is the code for the data simulation:

using Distributions,LinearAlgebra
using CSV,DataFrames
using Random,Turing,MCMCChains

function sim_bmm(N,Q,K)
    v = rand(1:K,N)

    y_counts = zeros(Int64,N,Q)
    n_counts = zeros(Int64,N,Q)

    alpha = Matrix{Float64}(undef, K,Q)
    beta = Matrix{Float64}(undef, K,Q)
    p = Matrix{Float64}(undef, K,Q)

    for k in 1:K
        for q in 1:Q
            alpha[k,q] = rand([3.0,10.0,30.0])
            beta[k,q] = rand([3.0,10.0,30.0])
            p[k,q] = alpha[k,q]./(alpha[k,q]+beta[k,q])

    #p_mat = alpha ./ (alpha + beta)

    for i in 1:N
        for q in 1:Q
            tmp = rand(Multinomial(10000,vec([p[v[i],q],1-p[v[i],q]])),1)
            y_counts[i,q] = tmp[1]
            n_counts[i,q] = tmp[2]

    return y_counts,n_counts,p,v

And here’s the control script:

## simulation parameters
N = 100 ## num of samples 
Q = 5 ## num of questions
K = 7 ## num of components
y_c,n_c,p,v = sim_bmm(N,Q,K)
t_c = cat(y_c, n_c, dims=3)
W = y_c+n_c
lambda = fill(1/K,K)
## sampler parameters
sampler_p = Gibbs(MH(:z))

## model definition
bbmm_p = binomial_replicate_mixture_model_p(t_c,W,K,p,lambda)
chain_p = sample(bbmm_p,sampler_p,MCMCThreads(),nsamples,nchains);

I think you meant

bbmm_p = binomial_replicate_mixture_model_p(t_c,W,K,p,pi)

I am getting a TaskFailedException when executing the last line, I mean

, is that your question?

Thank you for taking a look! Foolishly, that was not my issue: I just didn’t edit down my code correctly from the problem I was working on to this example. I’ve updated the code so it reflects the problem I’m hoping folks can look at (not the ones I unintentionally presented).