How (if possible) to rewrite this simple censored Poisson model without Turing.@addlogprob!

I want to estimate the Poisson distribution parameter λ (usually λ < 1). I am given N observations; however, they are censored: I can only see whether the value was 0 or greater than zero, i.e.

Instead of observing the latent vector:
0 1 2 1 0 1 2 1 0 0 0 0 3
I see the aliased vector y:
0 1 1 1 0 1 1 1 0 0 0 0 1

Since only two values are possible, for efficiency I use a different data representation: N, the total number of observations, and nzero, the number of observations equal to zero. Of course nzero <= N.
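For example (a small sketch, assuming y is the 0/1 vector shown above), the summary pair can be computed as:

N = length(y)             # total number of observations
nzero = count(==(0), y)   # how many observations were censored at zero

My model then uses only these two numbers: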

@model function poisson_model_fast(nzero::Int, N::Int)
    λ ~ Uniform(0, 100)
    # Add the censored log-likelihood to the joint log density by hand:
    Turing.@addlogprob! logpdf(Poisson(λ), 0) * nzero          # the nzero zeros
    Turing.@addlogprob! logccdf(Poisson(λ), 0) * (N - nzero)   # the N - nzero positives
end

The problem is that manually rewriting the target density, although perfectly legal, seems hackish and overly complicated to me. I am the only person on the team who can write probabilistic programs, and I don’t want to confront my teammates with too steep a learning curve.

Do you have any suggestions?

I’ve tried the following model, which uses the y vector directly, but it gives wrong results. Do you know why?

@model function poisson_model(y::Vector{Int})
    λ ~ Uniform(0, 100)
    y ~ truncated(Poisson(λ), 0, 1)
end

Here’s the whole of my code:

using Random
using Distributions
using Turing
using StatsPlots

Random.seed!(3)

N = 1000
lat = rand(Poisson(3.0), N)    # latent counts with true λ = 3
y = ifelse.(lat .> 0, 1, 0)    # aliased observations: 1 means "greater than zero"
nzero = count(==(0), y)        # number of zeros actually seen

@model function poisson_model_fast(nzero::Int, N::Int)
    λ ~ Uniform(0, 100)
    # Add the censored log-likelihood to the joint log density by hand:
    Turing.@addlogprob! logpdf(Poisson(λ), 0) * nzero          # the nzero zeros
    Turing.@addlogprob! logccdf(Poisson(λ), 0) * (N - nzero)   # the N - nzero positives
end
model = poisson_model_fast(nzero, N);

nsamples = 10000
nchains = 8
sampler = MH()
chains = sample(model, sampler, MCMCThreads(), nsamples, nchains, discard_initial = 500, thinning=50);
display(chains)
plot(chains[["λ"]]; colordim=:parameter, legend=true)

For reference, here is (hopefully) the same model, “model.bug”, in rjags (the input is the vector y). You may notice it uses slightly different logic: it introduces a latent discrete vector lat and a different censoring specification, dinterval(lat[i], latmax[i]), which ties the observed y[i] to whether lat[i] exceeds the cutpoint latmax[i].

model{
    for(i in 1:N){
        y[i] ~ dinterval(lat[i], latmax[i])
        lat[i] ~ dpois(lambda)
    }
    lambda ~ dunif(0, 100)
}

You would drive it with the following R script:

library(R2jags)
chain_count <- parallel::detectCores()
n.iter <- 10000
N <- 1000
lat <- rpois(N, 3.)
y <- ifelse(lat > 0, 1, 0)
latmax <- ifelse(y == 0, 100, 0)
data <- list(N = N, y = y, latmax = latmax)  # collect the inputs the model references
out <- jags.parallel(data = data, inits = NULL, n.iter = n.iter,
                     parameters.to.save = 'lambda', model.file = 'model.bug',
                     n.chains = chain_count)

This part (the y ~ truncated(Poisson(λ), 0, 1) model) gives you incorrect results because it’s not the same as the DGP you specified above. A truncated Poisson is a different distribution from the one your data come from.
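To see the difference numerically, here is a quick sketch with Distributions.jl, using the λ = 3 from your simulation above. truncated(Poisson(λ), 0, 1) renormalizes the probability mass over {0, 1}, whereas your DGP aliases every positive count to 1:

using Distributions

λ = 3.0
d = Poisson(λ)
t = truncated(d, 0, 1)

pdf(t, 1)    # P(X = 1) / (P(X = 0) + P(X = 1)) ≈ 0.75 under the truncated model
ccdf(d, 0)   # P(X > 0) = 1 - exp(-λ) ≈ 0.95, which is what your data encode

Those two probabilities disagree, so the truncated model targets a different λ.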

To be honest, using @addlogprob! here is probably the best solution, short of deriving a likelihood for this “indicator Poisson” distribution thing you’ve got.

Thank you very much, @cpfiffer, for the answer!

Can you explain to me what DGP is? I could not find it on Google.

Another question: what exactly do you mean by «“indicator poisson” distribution»? I know what an indicator function is, but I’m not sure I know what you mean by an “indicator distribution”. I suspect I am missing an important keyword.

Do you think I can use a better sampling algorithm than Metropolis-Hastings? logccdf(Poisson(λ), 0) is differentiable, so in principle I should be able to use Hamiltonian MC (or NUTS), but I get a very long and completely cryptic error message when I try sampler = NUTS() (Mozilla Community Pastebin/V34xekjR).

DGP = data generating process.

As for the «“indicator poisson” distribution»: I’m trying to make up a name to describe your distribution. There is no important keyword you’re missing; this is a non-standard application and I’m just trying to find the words for it.

Yeah, MH is not the best sampler here. Your link is broken, though, so I can’t debug the NUTS error right now.

Maybe a simple way to implement this is with a Binomial distribution:

    nzero ~ Binomial(N, exp(-λ))

For a Poisson(λ) variable, the probability of a zero is P(X = 0) = exp(-λ), and the complement, P(X > 0) = 1 - exp(-λ), covers everything you alias to 1. Each observation is therefore a Bernoulli trial (zero or not), and a sum of independent Bernoulli variables is a Binomial.
(See https://en.wikipedia.org/wiki/Poisson_distribution)
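As a quick sanity check (a sketch with made-up values for λ, N and nzero), this Binomial log-likelihood agrees with the two @addlogprob! terms up to the log binomial coefficient, which does not depend on λ:

using Distributions

λ, N, nzero = 0.7, 1000, 497
lp_manual   = nzero * logpdf(Poisson(λ), 0) + (N - nzero) * logccdf(Poisson(λ), 0)
lp_binomial = logpdf(Binomial(N, exp(-λ)), nzero)

# The gap is exactly log(binomial(N, nzero)), a constant in λ:
lp_binomial - lp_manual ≈ Float64(log(binomial(big(N), big(nzero))))   # true

Additive constants drop out in MCMC, so both formulations yield the same posterior for λ.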

Yes, you are right! It would be a truncated Poisson if, during the data generation process, I were discarding values > 1, but instead I am aliasing them to 1. Thank you!!

What bad luck with the Mozilla Pastebin. :wink: Here’s a classic pastebin instead: “TaskFailedException nested task error: TaskFailedException” (Pastebin.com).

The @addlogprob! version:

    Turing.@addlogprob! logpdf(Poisson(λ), 0)*nzero
    Turing.@addlogprob! logccdf(Poisson(λ), 0)*(N-nzero)

and the Binomial version:

    nzero ~ Binomial(N, exp(-λ))

are identical bits of code. They effect the same change in the log-prob value (up to a constant, the log of the binomial coefficient, which doesn’t influence inference). If the comment above («Yes, you are right! …») was a reply to my post (it wasn’t marked as such), then I think I did not understand it, or maybe I’m confused about the question.

No, it wasn’t, indeed. It was a comment on @cpfiffer’s answer, about why y ~ truncated(Poisson(λ), 0, 1) was wrong. I am apparently struggling with the Discourse interface. Sorry for the confusion.

Here’s the refined model, written in a more user-friendly way. What is great is that the sampling errors I had with the NUTS sampler are gone!!

@model function binomial_model_fast(nzero::Int, N::Int)
    λ ~ Uniform(0, 100)
    # pdf(Poisson(λ), 0) equals exp(-λ), so this could equally be written
    # as `nzero ~ Binomial(N, exp(-λ))`:
    nzero ~ Binomial(N, pdf(Poisson(λ), 0))
end
model = binomial_model_fast(nzero, N);
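With this parameterization NUTS runs for me without errors. A minimal sketch, reusing nsamples and nchains from the script above:

chains = sample(model, NUTS(), MCMCThreads(), nsamples, nchains)
display(chains)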

Thank you, @Dan, for pointing out that this is actually a Binomial model. I am very grateful for this insight!


Very well done, @Dan!
