I am trying to implement a simple species occupancy model, which makes use of a latent binary variable. I’ve tried to get inspiration from this repository, but got confused by the approach and tried to start from even simpler grounds.
The code is as follows. I start with simulating data:
Data Simulation
using Turing
using MCMCChains, StatsPlots
using Random
# Simulate the data
Random.seed!(12)
N_visits = 100
N_sites = 10
# True parameters
ψ = 0.8 # probability of occupancy
p = 0.4 # probability of detection
# Occupancy data
pres_vec = [rand(Bernoulli(ψ), N_visits) for i in 1:N_sites]
# Detection data
dets_vec = [rand.(Bernoulli.(pres*p)) for pres in pres_vec]
# Sites
sites = 1:N_sites
The model code follows:
# Model
@model occupancy(D, S) = begin
# Priors
ψ ~ Beta(1,1)
p ~ Beta(1,1)
# Initialize Z
Z = Vector{Int}(undef, length(S))
# Z ~ filldist(Bernoulli(ψ), length(S))
for site in S
Z[site] ~ Bernoulli(ψ)
μ = Z[site] * p
for visit in 1:length(D[site])
D[site][visit] ~ Bernoulli(μ)
end
end
end
Which I sample with HMC
.
Sampling
sampler = HMC(0.01, 10)
nsamples = 1000
chain = sample(occupancy(dets_vec, sites), sampler, nsamples);
plot(chain[["p", "ψ"]]; colordim=:parameter, legend=true)
This samples terribly.
Results
julia> chain |> summarystats
Summary Statistics
parameters mean std mcse ess_bulk ess_tail rhat ess_per_sec
Symbol Float64 Float64 Float64 Float64 Float64 Float64 Float64
ψ 0.8393 0.0000 0.0000 NaN NaN NaN NaN
p 0.8693 0.0000 NaN NaN NaN NaN NaN
Z[1] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[2] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[3] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[4] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[5] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[6] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[7] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[8] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[9] 1.0000 0.0000 NaN NaN NaN NaN NaN
Z[10] 1.0000 0.0000 NaN NaN NaN NaN NaN
I’d like some help in how to properly specify the latent Z
variable. I have tried to use a vector of undef
, as well as using filldist
, but I am not getting anywhere.
What am I doing wrong in my specification of this binary latent variable? Is Turing like stan in that you have to integrate out categorical latent variables?