Binary latent variable for occupancy modelling using Turing

VLucet · November 11, 2023, 8:12pm

I am trying to implement a simple species occupancy model, which makes use of a latent binary variable. I’ve tried to get inspiration from this repository, but got confused by the approach and tried to start from even simpler grounds.

The code is as follows. I start with simulating data:

Data Simulation

using Turing
using MCMCChains, StatsPlots
using Random

# Simulate the data
Random.seed!(12)
N_visits = 100
N_sites = 10

# True parameters
ψ = 0.8 # probability of occupancy
p = 0.4 # probability of detection

# Occupancy data
pres_vec = [rand(Bernoulli(ψ), N_visits) for i in 1:N_sites]

# Detection data
dets_vec = [rand.(Bernoulli.(pres*p)) for pres in pres_vec]

# Sites
sites = 1:N_sites

The model code follows:

# Model
@model occupancy(D, S) = begin
    
    # Priors
    ψ ~ Beta(1,1)
    p ~ Beta(1,1)

    # Initialize Z
    Z = Vector{Int}(undef, length(S))
    # Z ~ filldist(Bernoulli(ψ), length(S))

    for site in S
    	Z[site] ~ Bernoulli(ψ)
    	μ = Z[site] * p
	    for visit in 1:length(D[site])
	        D[site][visit] ~ Bernoulli(μ)
	    end
	end

end

Which I sample with HMC.

Sampling

sampler =  HMC(0.01, 10)
nsamples = 1000
chain = sample(occupancy(dets_vec, sites), sampler, nsamples);

plot(chain[["p", "ψ"]]; colordim=:parameter, legend=true)

This samples terribly.

Results

julia> chain |> summarystats
Summary Statistics
  parameters      mean       std      mcse   ess_bulk   ess_tail      rhat   ess_per_sec 
      Symbol   Float64   Float64   Float64    Float64    Float64   Float64       Float64 

           ψ    0.8393    0.0000    0.0000        NaN        NaN       NaN           NaN
           p    0.8693    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[1]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[2]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[3]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[4]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[5]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[6]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[7]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[8]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
        Z[9]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN
       Z[10]    1.0000    0.0000       NaN        NaN        NaN       NaN           NaN

I’d like some help in how to properly specify the latent Z variable. I have tried to use a vector of undef, as well as using filldist, but I am not getting anywhere.

What am I doing wrong in my specification of this binary latent variable? Is Turing like stan in that you have to integrate out categorical latent variables?

VLucet · November 25, 2023, 6:00pm

I have tried to implement the marginalization, to no avail.

using StatsFuns

@model occupancy(D) = begin

    # Q sums 
    Q = sum.(D)
    
    # Priors
    ψ ~ Beta(1,1)
    p ~ Beta(1,1)

    for (i, q) in enumerate(Q)
        if q > 0
            Turing.@addlogprob! log(ψ)
            for d in D[i]
                Turing.@addlogprob! logpdf(Bernoulli(p), d)
            end
        elseif q == 0
            visit_sum = log(ψ)
            for _ in D[i]
                visit_sum += log(1-p)
            end
            Turing.@addlogprob! logsumexp([visit_sum, log(1-ψ)])
        end
    end

YSanchezAraujo · December 1, 2023, 2:37pm

I’d say the first thing you’d want to do is maybe figure out where you can vectorize some operations (I don’t think this will do anything to the results, but. might help with speed) to then be able to write vectorized sampling statements e.g. Z .~ Bernoulli.(phi) one way I’ve accomplished that is by introducing indexing arrays that you feed into your model, Z[idxs] .~ Bernoulli.(phi) for example.

But actually it looks like the index of Z does not matter, so you can do

Z ~ filldist(dist, S) # this will both create Z and sample

But aside from that, it might be helpful to sample in the space of Gaussians, and then transform to the appropriate range:

# instead of
p ~ Beta(1,1);

# try this
lam_p ~ Normal(-1, 1);
p = logistic(lam_p);

sampling from Gaussians in my experience, across almost all problems leads to faster models, better rhat, and larger ESS. Finally, try NUTS() , maybe this will work?

@model function occupancy(D, S) 
    lam_psi ~ Normal(-1, 0.5) # will have to play around with these
    lam_p ~ Normal(-1, 0.5) # to get the correct setting, or set them as hyper priors

    psi = logistic(lam_psi)
    p = logistic(lam_p)

    Z ~ filldist(Bernoulli(psi), S)

    u = Z * p

    for site in 1:S
        D[site] .~ Bernoulli.(u)
    end
end

sampler = NUTS()
nsamples = 1000
chain = sample(occupancy(dets_vec, sites), sampler, nsamples);

simonbrauer · December 1, 2023, 10:06pm

Your marginalized model is working for me. However, I don’t think your data generation matches your models.

This produces a 10 x 100 array of true occupation data (pres_vec) and a 10 x 100 array of detection data (dets_vec). Based on the occupation data, it would seem that the animal is truly moving in and out of each site, as if you have one sensor at each site and you’re collecting a time series.

However, your models (and, I think, the two resources you linked) suggest that they’re either in the whole time or out, as if you’re collecting data from 100 sensors per site at one time point. For example

specifies only one true occupancy datum per site.

Assuming that model is correct (and I apologize, as I’m not familiar with these models so I might be making some naïve assumptions), then you probably want something more like this, where you first simulate a single occupancy observation per site and then simulate your detection data per visit based on the occupancy datum.

pres_vec = rand(Bernoulli(ψ), N_sites)
dets_vec = [rand.(Bernoulli.(pres * p), N_visits) for pres in pres_vec]

VLucet · December 1, 2023, 10:08pm

Yes, I think that is the solution! Thanks for catching this @simonbrauer , it was driving me crazy.

Topic		Replies	Views
Occupancy model - jags to julia - NUTS sampler New to Julia turing	0	319	March 22, 2022
Why isn't mixture model sampling in Turing? General Usage question	2	138	June 18, 2024
Problem Turing bug multivariate Bernoulli General Usage turing	3	184	June 5, 2024
Latent Variables New to Julia question , turing	5	825	August 29, 2020
Mixture models with binary indicator variables Probabilistic programming	11	912	February 18, 2021

Binary latent variable for occupancy modelling using Turing

Related topics