I want to estimate the Poisson distribution parameter λ (usually λ < 1). I am given N observations, however they are censored: I can only see if the value was 0 or greater than zero, i.e.
Instead of observing the vector latent
:
0 1 2 1 0 1 2 1 0 0 0 0 3
I see the y
:
0 1 1 1 0 1 1 1 0 0 0 0 1
Since there are only two possible values allowed, for efficiency I use different data: N
- total number of observations, nzero
- number of observations equal zero. Of course nzero
<= N
.
@model function poisson_model_fast(nzero::Int, N::Int)
λ ~ Uniform(0, 100)
Turing.@addlogprob! logpdf(Poisson(λ), 0)*nzero
Turing.@addlogprob! logccdf(Poisson(λ), 0)*(N-nzero)
end
The problem I have is that manually re-writing the target function, although perfectly legal, does seem hackish and overly complicated to me. I am the only person who can write Probabilistic Programs in the team, and I don’t want to introduce my mates with too steep learning curve.
Do you have any suggestions?
I’ve tried to use the following model, that uses the y
vector directly, but it gives wrong results. Do you know, why?
@model function poisson_model(y::Vector{Int})
λ ~ Uniform(0, 100)
y ~ truncated(Poisson(λ), 0, 1)
end
Here’s the whole of my code:
using Random
using Distributions
using Turing
using StatsPlots
Random.seed!(3)
N = 1000
nzero = 50
lat = rand(Poisson(3.), N)
y = ifelse.(lat .> 0, 1, 0)
@model function poisson_model_fast(nzero::Int, N::Int)
λ ~ Uniform(0, 100)
Turing.@addlogprob! logpdf(Poisson(λ), 0)*nzero
Turing.@addlogprob! logccdf(Poisson(λ), 0)*(N-nzero)
end
model = poisson_model_fast(nzero, N);
nsamples = 10000
nchains = 8
sampler = MH()
chains = sample(model, sampler, MCMCThreads(), nsamples, nchains, discard_initial = 500, thinning=50);
display(chains)
plot(chains[["λ"]]; colordim=:parameter, legend=true)
For reference, here’s (hopefully) the same model “model.bug
” in rjags
(input is a vector y
). You may notice, it uses slightly different logic, one which uses latent discrete vector lat
, and which uses a different censoring specification.
model{
for(i in 1:N){
y[i] ~ dinterval(lat[i], latmax[i])
lat[i] ~ dpois(lambda)
}
lambda ~ dunif(0, 100)
}
You would drive it with the following R script:
library(R2jags)
chain_count <- parallel::detectCores()
n.iter <- 10000
N <- 1000
lat <- rpois(N, 3.)
y <- ifelse(lat>0,1,0)
latmax <- ifelse(y==0, 100, 0)
out<-jags.parallel(data=data, inits=NULL, n.iter=n.iter,
parameters.to.save = 'lambda', model.file = 'model.bug', n.chains = chain_count)