I am trying to fit a Besag York Mollié (BYM) along the lines of this Stan case study. However, I have encountered an issue fitting the model. I hope someone can help me out to figure out what has gone wrong (I am almost certain I miscoded something). Code is at the bottom of the page.
The issue
In addition to fixed effects, ICAR models include two random effects components: one handles the spatial autcorrelation and the second the leftover non-spatial heterogeneity. ICAR models can be used when the area is split up int areal units. Think US counties, countries, etc.
My problem comes when I select the distribution for the ϕ parameters. If I use filldist(Uniform(-1, 1), N)
, the model runs fine and returns the correct values for the fixed effects. But the spatial effects are wrong because the posterior is constrained. The space effects are often of interest themselves, so this is an issue.
If I choose larger values for ϕ, such as filldist(Uniform(-2, 2), N)
or filldist(Flat(), N)
, the model returns incorrect values for both the fixed and random effects.
Running regular regression models on simulated non-spatial data returns the correct values when I use filldist(Flat(), N)
, so I am guessing the error has to do with my specification of the model. Any advice or pointing me in the correct direction is appreciated.
using Plots, StatsPlots
using Turing, ReverseDiff, Memoization
using LinearAlgebra
#### enter data ####
N = 56
N_edges = 132
E = Vector( [ 1.4; 8.7; 3.0; 2.5; 4.3; 2.4; 8.1; 2.3; 2.0; 6.6; 4.4; 1.8; 1.1; 3.3; 7.8; 4.6; 1.1; 4.2; 5.5;
4.4; 10.5; 22.7; 8.8; 5.6; 15.5; 12.5; 6.0; 9.0; 14.4; 10.2; 4.8; 2.9; 7.0; 8.5; 12.3; 10.1; 12.7; 9.4;
7.2; 5.3; 18.8; 15.8; 4.3; 14.6; 50.7; 8.2; 5.6; 9.3; 88.7; 19.6; 3.4; 3.6; 5.7; 7.0; 4.2; 1.8; ] )
node1 = Vector( [1; 1; 1; 1; 2; 2; 3; 3; 4; 4; 4; 5; 5; 5; 5; 6; 7; 7; 7; 7; 9; 9; 9; 9; 9; 10; 10; 11; 13; 13; 14;
14; 14; 15; 15; 15; 16; 16; 16; 16; 17; 17; 18; 18; 18; 18; 18; 20; 21; 21; 23; 23; 23; 23; 23; 24; 24; 24; 24; 24; 24; 24;
24; 25; 25; 26; 26; 26; 27; 27; 27; 28; 28; 29; 29; 29; 30; 30; 30; 30; 30; 31; 31; 31; 31; 32; 33; 33; 34; 34; 34; 34; 34;
34; 34; 35; 35; 36; 36; 36; 37; 37; 38; 38; 38; 38; 38; 39; 39; 40; 40; 40; 41; 41; 41; 42; 42; 44; 44; 45; 46; 46; 47; 47;
47; 48; 49; 49; 49; 51; 52; 55] )
node2 = Vector( [5; 9; 11; 19; 7; 10; 6; 12; 18; 20; 28; 11; 12; 13; 19; 8; 10; 13; 16; 17; 11; 17; 19; 23; 29; 16; 22; 12; 17; 19; 31;
32; 35; 25; 29; 50; 17; 21; 22; 29; 19; 29; 20; 28; 33; 55; 56; 55; 29; 50; 29; 34; 36; 37; 39; 27; 30; 31; 44; 47; 48; 55;
56; 26; 29; 29; 42; 43; 31; 32; 55; 33; 45; 34; 43; 50; 38; 42; 44; 45; 56; 32; 35; 46; 47; 35; 45; 56; 39; 40; 42; 43; 51;
52; 54; 37; 46; 37; 39; 41; 41; 46; 42; 44; 49; 51; 54; 40; 41; 41; 49; 52; 46; 49; 53; 43; 51; 48; 49; 56; 47; 53; 48; 49;
53; 49; 52; 53; 54; 54; 54; 56] )
y = Vector( [9; 39; 11; 9; 15; 8; 26; 7; 6; 20; 13; 5; 3; 8; 17; 9; 2; 7; 9; 7; 16; 31; 11; 7; 19; 15; 7; 10; 16; 11; 5;
3; 7; 8; 11; 9; 11; 8; 6; 4; 10; 8; 2; 6; 19; 3; 2; 3; 28; 6; 1; 1; 1; 1; 0; 0] )
x = Vector( [ 1.6; 1.6; 1.0; 2.4; 1.0; 2.4; 1.0; 0.7; 0.7; 1.6; 0.7; 1.6; 1.0; 2.4; 0.7; 1.6; 1.0; 0.7; 0.7; 1.0; 0.7; 1.6; 1.0;
0.7; 0.1; 0.1; 0.7; 0.7; 1.0; 1.0; 0.7; 2.4; 1.0; 0.7; 0.7; 0.0; 1.0; 0.1; 1.6; 0.0; 0.1; 1.6; 1.6; 0.0; 0.1; 0.7;
0.1; 0.1; 0.0; 0.1; 0.1; 0.0; 0.1; 0.1; 1.6; 1.0 ] )
scaling_factor = 0.3984139
id_num = [1:56;]
log_E = log.(E)
n_adapt = 2_000
n_iter = 1_000
@model function bym2(N, node1, node2, y, x, E, sf, id_num)
# priors
α ~ Normal(0, 1)
β ~ Normal(0, 1)
σ ~ truncated(Normal(0, 1), 0, Inf)
#ϕ ~ filldist(Uniform(-1, 1), N). # correct values of FEs, but returns incorrect values for REs
#ϕ ~ filldist(Uniform(-2, 2), N) # Doesn't work
ϕ ~ filldist(Flat(), N) # Doesn't work
sum_ϕ ~ Normal(0, 0.0001 * N) # soft sum to zero constraint
θ ~ filldist(Normal(0, 1), N)
ρ ~ Beta(0.5, 0.5)
# model
sum_ϕ = sum(ϕ)
convolved_re = sqrt(1 - ρ) .* θ[id_num] .+ sqrt(ρ / sf) .* ϕ[id_num]
Turing.@addlogprob! -0.5 * dot(ϕ[node1] - ϕ[node2], ϕ[node1] - ϕ[node2]) # ICAR adjustment
λ = exp.(E .+ α .+ β * x .+ convolved_re .* σ)
y .~ Poisson.(λ)
return λ
end
Turing.setadbackend(:reversediff)
Turing.setrdcache(true)
model2 = bym2(N, node1, node2, y, x, log_E, scaling_factor, id_num)
@time chns2 = sample(model2, NUTS(n_adapt, 0.95), n_iter)