So the Turing model doesn't generate Y₀ and Y₁; it takes them as input data and computes a likelihood, which is multiplied by the priors to get the posterior for the parameters. HOWEVER, I see that the number of zeros informs the alpha and gamma parameters! So now I understand your concern better.
I think maybe what you want is something like this (I wasn't able to test it because I'm having some issues precompiling Turing on my machine):
using Turing, Random, Distributions
Random.seed!(746);
# Two useful functions
logistic(x) = exp(x) / (1 + exp(x))
z(N,x) = N - round(Int, N * x) # Number of zeros in a sample of size N where a fraction x is non-zero
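# Quick hand-computed sanity check of the two helpers:
@assert logistic(0) == 0.5
@assert z(10, 0.5) == 5 # 5 zeros when half of a sample of 10 is non-zero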
### Characteristics of the true generating process that I would be trying to 'recover' using the Bayesian model
μ, τ = 0.1, 0.2
σ, λ = 0.02, 0.01
α, γ = 0.3, 0.7
N = 1000
# Simulate data from the outcome distributions of the control and treatment groups
z₀ = z(N, logistic(α)); z₁ = z(N, logistic(α + γ)) # Zero counts for the control and treatment groups
y₀ = [zeros(z₀); rand(LogNormal(μ,exp(σ)), N - z₀)] # z₀ zeros and N - z₀ LogNormals
y₁ = [zeros(z₁); rand(LogNormal(μ + τ, exp(σ + λ)), N - z₁)] # z₁ zeros and N - z₁ LogNormals
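# Sanity check: the simulated zero fractions should match the generating process
mean(y₀ .== 0) # should equal 1 - logistic(α) up to rounding
mean(y₁ .== 0) # should equal 1 - logistic(α + γ) up to rounding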
# Bayesian model
@model function outcome(Y₀,z₀,Y₁,z₁)
# Assumptions
μ ~ Normal(0, 5)
τ ~ Normal(0, 5)
σ ~ Normal(0, 5)
λ ~ Normal(0, 5)
α ~ Normal(0, 5)
γ ~ Normal(0, 5)
# Observations
z₀ ~ Dirac(z(length(Y₀) + z₀, logistic(α))) # Y₀ holds only the non-zeros, so the total sample size is length(Y₀) + z₀
z₁ ~ Dirac(z(length(Y₁) + z₁, logistic(α + γ)))
Y₀ .~ LogNormal(μ,exp(σ))
Y₁ .~ LogNormal(μ + τ, exp(σ + λ))
return Y₀, Y₁
end
# MCMC sampling
chain_outcome = sample(@views outcome(y₀[z₀+1:end], z₀, y₁[z₁+1:end], z₁), NUTS(0.65), 100); # Pass only the non-zero values, plus the zero counts
# MCMC Results
summarystats(chain_outcome)
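# A couple of posterior quantities of interest (assuming the chain sampled successfully):
mean(chain_outcome[:τ]) # Treatment effect on the log-outcome location (true value 0.2)
mean(chain_outcome[:γ]) # Treatment effect on the logit of the non-zero fraction (true value 0.7)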
Essentially, what this does is completely reject any values of alpha and gamma that produce the wrong z counts, and then use just the non-zero entries of the Y₀ and Y₁ vectors for the lognormal likelihoods.
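To make the rejection mechanism concrete, here's a minimal illustration of how Dirac behaves in Distributions.jl: its log-density is 0 at the observed count and -Inf everywhere else (426 is the z₀ implied by the true parameters above):
logpdf(Dirac(426), 426) # 0.0: this (α, γ) combination is kept
logpdf(Dirac(426), 425) # -Inf: this one is rejected outright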
Since the non-zero observations within each group are i.i.d. from the same lognormal, I just broadcast the observation statements with .~, which should be fine.
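For what it's worth, the broadcast form is just shorthand; as far as I know it's equivalent to looping over the observations inside the model:
# Equivalent to Y₀ .~ LogNormal(μ, exp(σ)) in the @model body
for i in eachindex(Y₀)
    Y₀[i] ~ LogNormal(μ, exp(σ))
end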