Help translating a JAGS model to Turing

I’m a beginner with Turing.jl, and am just trying to get to grips with it by translating some examples from the book Bayesian Cognitive Modeling: A Practical Course. I’ve had some luck with a number of models, but have gotten stuck on this one…


  # Observed Returns
  for (i in 1:m){
     k[i] ~ dbin(theta,n)
  }
  # Priors on Rate Theta and Number n
  theta ~ dbeta(1,1)
  n ~ dcat(p[])
  for (i in 1:nmax){
    p[i] <- 1/nmax
  }

where the observed data is:

nmax = 500
k = [16 18 22 25 27]
m = length(k)


using Turing, StatsPlots

@model function survey(k, p)
    m = length(k)
    θ ~ Beta(1, 1)
    n ~ Categorical(p)
    for i in 1:m
        k[i] ~ Binomial(n, θ)
    end
end

# data
k = [16, 18, 22, 25, 27]
nmax = 500
p = ones(nmax)/nmax
# sample
chain = sample(survey(k, p), HMC(0.05, 10), 1000)
# plot
marginalkde(chain[:n], chain[:θ])

but it gives this error upon sampling

ERROR: InexactError: Int64(329.0074058217222)

I suspect it’s something to do with n ~ Categorical(p) but would appreciate any pointers to fix this.

I’m certainly no expert on MCMC, but I suspect that HMC requires everything in your model to be differentiable, and the integer components are not. What happens if you choose another sampler, like Metropolis-Hastings?


Perfect… I got too fixated on the model specification and forgot to think about the sampling algorithm. It works fine with MH() as the sampler. Thanks very much!
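For reference, here is the full working script with only the sampler swapped (same model and data as above; this is just a sketch of the fix, not a tuned setup):

```julia
using Turing

# Same model as in the original post.
@model function survey(k, p)
    m = length(k)
    θ ~ Beta(1, 1)
    n ~ Categorical(p)
    for i in 1:m
        k[i] ~ Binomial(n, θ)
    end
end

# data
k = [16, 18, 22, 25, 27]
nmax = 500
p = ones(nmax) / nmax

# MH() needs no gradients, so the discrete n is no problem.
chain = sample(survey(k, p), MH(), 1000)
```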

FYI, for problems like this you could use the compositional Gibbs sampler: HMC and friends to sample θ, and another sampler like PG to sample discrete parameters like n, e.g.

chain = sample(survey(k, p), Gibbs(HMC(0.1, 10, :θ), PG(10, :n)), 1000)

see for instance the HMM tutorial.


Nice. I’m entirely new to compositional sampling, but this looks pretty cool. I might try to compare the samples from MH with those from your suggestion. This example is quite good at showing the limitations of MH: the marginalkde plot already shows some differences from a quick test.

I haven’t used the compositional Gibbs sampler much myself, so I can’t say much about performance or give other hints, but I think its main aim is exactly to deal with cases where you have both continuous and discrete parameters in your model (see the original Turing paper, for instance). The idea behind it is very similar to a classical Gibbs sampler as far as I know: each block of parameters assigned to the same sampler is sampled in turn, conditioning on the other parameters.


Also note that it is common to just integrate out discrete parameters (if feasible), so that you can use a sampler that requires differentiability on ℝ^n.
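For this model the marginalization is feasible, since the likelihood is just a finite sum over n. A minimal sketch (model name is illustrative; it assumes LogExpFunctions, a Turing dependency, is available, and uses Turing's @addlogprob! to inject the summed-out likelihood):

```julia
using Turing
using LogExpFunctions: logsumexp

# Sum n out of the likelihood, leaving only the continuous θ, so a
# gradient-based sampler such as NUTS can be used directly:
#   log p(k | θ) = log Σ_n (1/nmax) Π_i Binomial(k_i; n, θ)
@model function survey_marginalized(k, nmax)
    θ ~ Beta(1, 1)
    # Terms with n < maximum(k) have zero likelihood, so start the sum there.
    logliks = [sum(logpdf.(Binomial(n, θ), k)) for n in maximum(k):nmax]
    @addlogprob! logsumexp(logliks) - log(nmax)
end

k = [16, 18, 22, 25, 27]
chain = sample(survey_marginalized(k, 500), NUTS(), 500)
```

A posterior over n can still be recovered afterwards by evaluating the per-n likelihoods at the sampled θ values.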



If you care about discrete RVs and know the Gibbs conditional to sample it you can also use GibbsConditional in Turing which allows you to specify the Gibbs conditional by hand. This can work better than black box particle Gibbs sampling.
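For this model the conditional for θ is available in closed form by Beta-Binomial conjugacy (this derivation is mine, not from the thread): θ | n, k ~ Beta(1 + Σ kᵢ, 1 + m·n − Σ kᵢ). A sketch of how it might be plugged in, with PG still handling n:

```julia
using Turing

# Same model as above.
@model function survey(k, p)
    m = length(k)
    θ ~ Beta(1, 1)
    n ~ Categorical(p)
    for i in 1:m
        k[i] ~ Binomial(n, θ)
    end
end

k = [16, 18, 22, 25, 27]
nmax = 500
p = ones(nmax) / nmax

# c is a NamedTuple with the current values of the other variables (here c.n).
# By conjugacy: θ | n, k ~ Beta(1 + Σkᵢ, 1 + m*n - Σkᵢ).
cond_θ(c) = Beta(1 + sum(k), 1 + c.n * length(k) - sum(k))

chain = sample(survey(k, p),
               Gibbs(GibbsConditional(:θ, cond_θ), PG(10, :n)),
               500)
```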