Translating Binomial N-mixture models from JAGS

Hi there!

I’m new to Julia and Turing, but I’m trying to translate some ecological models by Kéry and Royle (2016) from JAGS to Turing. The specific model here is a Binomial N-mixture model (or Poisson-Binomial mixture model). I have experimented with various samplers to deal with the discrete latent parameters, but unfortunately I can’t seem to obtain results close to those from the JAGS implementation. The code for generating the data in Python and JAGS is as follows:

import pyjags
import numpy as np

n_sites = 150
n_surveys = 2
counts = np.zeros((n_sites, n_surveys))

expected_abundance = 2.5
p_detection = 0.4

# Production process
N = np.random.poisson(expected_abundance, size=n_sites)

# Observation process
for survey in range(n_surveys):
    counts[:, survey] = np.random.binomial(size=n_sites, n=N, p=p_detection)

code = """
model {
  lambda ~ dgamma(0.001, 0.001)
  p ~ dbeta(1, 1)

  for (i in 1:M) {
    N[i] ~ dpois(lambda)
    for (j in 1:J) {
      C[i,j] ~ dbin(p, N[i])
    }
  }
}
"""

M, J = counts.shape
model = pyjags.Model(
    code=code, 
    data={"C": counts, "M": M, "J": J},
    init={"N": counts.max(1), "p": 0.5, "lambda": 1}
)

_ = model.sample(5000, thin=25, vars=['lambda', 'p'])
samples = model.sample(25000, thin=25, vars=['lambda', 'p'])

which produces the following estimates:

         mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_mean  ess_sd  ess_bulk  ess_tail  r_hat
lambda  2.228  0.329   1.653    2.804      0.007    0.005    1978.0  1900.0    2221.0    2870.0    1.0
p       0.465  0.061   0.350    0.578      0.001    0.001    1916.0  1916.0    1992.0    2630.0    1.0

For Julia and Turing, I use the following lines of code:

using Random, Distributions, StatsBase, Turing

n_sites = 150
λ = 2.5

N = rand(Poisson(λ), n_sites)
p = 0.4
n_surveys = 2

C = Array{Int64, 2}(undef, n_sites, n_surveys)

for j = 1:n_surveys
    for i = 1:n_sites
        C[i, j] = rand(Binomial(N[i], p))
    end
end

@model function binomial_nmixture(C)
    n_sites, n_surveys = size(C)
    
    # NOTE Turing uses characterization using shape k and scale θ
    λ ~ Gamma(0.001, 1000) 
    p ~ Beta(1, 1)
    
    N = tzeros(Int, n_sites)
    for i = 1:n_sites
        N[i] ~ Poisson(λ)
        for j = 1:n_surveys
            C[i, j] ~ Binomial(N[i], p)
        end
    end
end

iterations = 10000
proposal, _ = findmax(C, dims=2)

chain = sample(
        binomial_nmixture(C), 
        SMC(),
        iterations, 
        progress=true)

But the results don’t come close to the true parameters. Increasing the number of iterations doesn’t seem to do much and I have also experimented with combining HMC with PG, but without much improvement. Does anyone have an idea of how to proceed here? Many thanks!

This prior for λ puts essentially all of its mass near zero.

Maybe try plotting it or sampling from it to get a feel for whether the parameters make sense:

using StatsPlots
d = Gamma(0.001, 1000)
plot(d)

rand(d, 10)

You’re absolutely right, but these are also the parameters used in the JAGS implementation. I tried more sensible parameters, such as \lambda \sim \mathrm{Gamma}(k=2, \theta=2), but to no avail.

I’d guess this line is what makes it tricky: if any C[i, j] > N[i], the proposal will be rejected (since p(C > N \mid N) = 0).
Not sure how JAGS deals with this. Smarter proposals for N that guarantee N[i] >= maximum(C[i, :]) (instead of plain Poisson proposals) would probably help within a simple Gibbs/MH sampler.
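For example, in Distributions.jl any such proposal gets a log-probability of -Inf, so it can never be accepted (a quick illustration, not part of the model):

using Distributions

# A proposed abundance of N[i] = 2 is impossible if a count of 3 was observed at that site:
logpdf(Binomial(2, 0.4), 3)  # returns -Inf, so the proposal is always rejected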

(sorry if this is not much help and you’re well aware of the above, I’m no expert with Turing. I’m just guessing with you)

I appreciate your guessing with me very much! :slight_smile: I’m gonna try and see if I can find some more information on how JAGS handles such cases.


I’d be very interested to know what JAGS is doing here. Using truncated Poisson priors (passing in a vector of per-site maximum counts as the lower bounds) allows Turing to recover p, but lambda is still way off, since low lambda values still allow accurate estimation of N due to the truncation.
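For concreteness, the truncated-prior variant I mean is roughly the following (just a sketch; Mmax is assumed to be vec(maximum(C, dims=2)), the per-site maximum count):

@model function binomial_nmixture_trunc(C, Mmax)
    n_sites, n_surveys = size(C)
    λ ~ Gamma(2, 2)
    p ~ Beta(1, 1)
    N = tzeros(Int, n_sites)
    for i = 1:n_sites
        # N[i] can never fall below the largest count observed at site i
        N[i] ~ truncated(Poisson(λ), Mmax[i], Inf)
        for j = 1:n_surveys
            C[i, j] ~ Binomial(N[i], p)
        end
    end
end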

Could you post a histogram of the posterior distribution of lambda from the JAGS model?

@FolgertK

This is a super hacky “solution”, but it seems to work.


@model function binomial_nmixture(C, M)
    n_sites, n_surveys = size(C)
    λ ~ truncated(Gamma(0.001, 1000), 0.1, 10) # truncated this as I was getting weird errors,
                                               # apparently due to Inf values for lambda?
    p ~ Beta(1, 1)
    N ~ arraydist(Truncated.(Poisson.(fill(λ, n_sites)), M, 10.0)) # truncated to avoid impossible values for N
    for i = 1:n_sites
        Turing.@addlogprob! -logpdf(Truncated(Poisson(λ), M[i], 10.0), N[i]) # remove logprob due to the truncated Poisson
        Turing.@addlogprob! logpdf(Poisson(λ), N[i])                         # add logprob due to the plain Poisson
        for j = 1:n_surveys
            C[i, j] ~ Binomial(N[i], p)
        end
    end
end

iterations = 10000
proposal, _ = findmax(C, dims=2)

M = float.(vec(maximum(C,dims = 2)))
g = Gibbs(PG(30, :p), MH(:λ,:N) ) # The addlogprob! trick doesn't seem to work with all samplers. 
                                  # Works with MH. Used PG for p, as MH needs lots of iterations.
chain = sample(
        binomial_nmixture(C,M), 
        g,
        iterations, 
        progress=true)

To be clear, I don’t consider this to be a satisfactory solution. It was just the best I could come up with in the hour or so of trial and error before bed.

Technically, it should be straightforward to define a custom distribution that produces random samples according to a truncated Poisson distribution, but with logpdf calculated according to a standard Poisson distribution. Whether or not this is a valid thing to do from a statistical point of view is another question.
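Something along these lines might do it (untested sketch; TruncatedPoissonProposal is a made-up name, and a full Distributions.jl implementation may need a few more methods depending on the sampler):

using Distributions, Random

# Draws come from a lower-truncated Poisson (so N can never fall below the observed
# maximum count), but the reported log-density is that of the plain Poisson prior.
struct TruncatedPoissonProposal{T<:Real} <: DiscreteUnivariateDistribution
    λ::T
    lower::Int   # per-site maximum observed count
end

Distributions.rand(rng::Random.AbstractRNG, d::TruncatedPoissonProposal) =
    rand(rng, Truncated(Poisson(d.λ), d.lower, Inf))

Distributions.logpdf(d::TruncatedPoissonProposal, n::Real) = logpdf(Poisson(d.λ), n)

Distributions.minimum(d::TruncatedPoissonProposal) = d.lower
Distributions.maximum(d::TruncatedPoissonProposal) = Inf

In the model above, this would replace the arraydist/@addlogprob! pair with something like N[i] ~ TruncatedPoissonProposal(λ, Int(M[i])).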

 parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_sec 
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float64 

           λ    2.4691    0.1329     0.0013    0.0106     33.7847    1.0332        0.0464
           p    0.4054    0.0177     0.0002    0.0004   1328.9485    1.0116        1.8242

So these estimates are closer to the true values than the JAGS ones, but that could be a quirk of the random data generation. It would be nice to compare on the same dataset.
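(One way to do that: dump the simulated counts from the Python script, e.g. with np.savetxt("counts.csv", counts, fmt="%d", delimiter=","), and read the same matrix on the Julia side; the file name is arbitrary.)

using DelimitedFiles

# Read the 150×2 count matrix written by the Python script
C = readdlm("counts.csv", ',', Int)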


Hi @EvoArt! Thanks so much for looking into this. Your solution might be a bit hacky, but it’s already much, much better than what I was getting. I’m still trying to uncover what JAGS is doing exactly, but here are the density plots for the two parameters p and \lambda: