I am trying to implement a hierarchical model, and am running into issues getting the model to return the parameters I used to generate a test data set. When I implement the model using TuringGLM I get the parameters I expect, but I think I must be doing something wrong when I set up the model by hand rather than relying on a model formula. My TuringGLM formula is response ~ treatment * species + (1|genotype). Response is a measured value (rather than a count), treatment consists of three categories, species consists of four categories, and genotype consists of three categories per species (twelve groups in total). Technically, each genotype is specific to its species, so it could be nested, but I think the model works out the same either way.
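For reference, the TuringGLM fit looks roughly like this (a sketch: df is assumed to be a DataFrame holding the columns named in the formula, and the sampler settings are illustrative):

using TuringGLM

# df is assumed to hold the response, treatment, species, and genotype columns
fm = @formula(response ~ treatment * species + (1 | genotype))
mdl = turing_model(fm, df)        # TuringGLM builds data-dependent default priors
chn = sample(mdl, NUTS(), 2_000)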
The model I have tried to set up is:
using Turing, LinearAlgebra

@model function randomintercept_regression(y, X, idx; n_gr=length(unique(idx)), predictors=size(X, 2))
    # priors; these assume the data have been rescaled
    α ~ Normal(0, 10)                       # population-level intercept
    β ~ filldist(Normal(0, 10), predictors) # population-level coefficients
    σ ~ Exponential(10)                     # residual SD
    # prior for the SD of the random intercepts;
    # usually requires thoughtful specification
    τ ~ truncated(Cauchy(0, 10); lower=0)   # group-level SD
    αⱼ ~ filldist(Normal(0, τ), n_gr)       # group-level intercepts
    # likelihood
    ŷ = α .+ X * β .+ αⱼ[idx]
    y ~ MvNormal(ŷ, σ^2 * I)                # I requires LinearAlgebra
end
where idx is the list of genotypes coded as integers, X is a matrix of the predictor variables for each individual in the study (coded as 1s and 0s, with each column representing a different level of treatment, species, or the interaction between the two), and y represents the measured response of the individual. The fact that my TuringGLM model returns the correct parameter estimates suggests to me that it isn't an issue with the structure of my data, but rather with how the model is set up. I have all the code associated with this project here, including code for data generation, should you want to do a really deep dive. Much of the code is adapted from this resource.
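For completeness, a minimal sketch of how the model is conditioned and sampled, assuming y, X, and idx are as described above (the sampler settings are illustrative, not exactly what I ran):

using Turing

model = randomintercept_regression(y, X, idx)
chain = sample(model, NUTS(), 2_000)  # NUTS with default target acceptance
describe(chain)                       # posterior summary, as in the tables below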
Thanks for your help!
I don't think you are misspecifying your Turing.jl model. TuringGLM.jl uses different default priors, which take the underlying data into account; that might be the culprit here. Can you post the estimates from the TuringGLM.jl model versus the Turing.jl model you've posted?
One other interesting idea would be to normalize all your data X and response y using ZScoreTransform (take a look here: Bayesian-Statistics/02-linear_regression-kidiq.jl at main · storopoli/Bayesian-Statistics · GitHub) and proceed with Normal(0, 2.5) priors for β and α, and Exponential(1) for σ.
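A sketch of that suggestion, assuming X is a Float64 matrix and y a Float64 vector; the model is the one posted above with only the α, β, and σ priors swapped, and the τ prior left as the original:

using Turing, LinearAlgebra, StatsBase

# z-score the design matrix column-wise and the response
X_std = standardize(ZScoreTransform, X; dims=1)
y_std = standardize(ZScoreTransform, y; dims=1)

@model function randomintercept_std(y, X, idx; n_gr=length(unique(idx)), predictors=size(X, 2))
    α ~ Normal(0, 2.5)                       # population-level intercept
    β ~ filldist(Normal(0, 2.5), predictors) # population-level coefficients
    σ ~ Exponential(1)                       # residual SD
    τ ~ truncated(Cauchy(0, 10); lower=0)    # group-level SD, kept from the original
    αⱼ ~ filldist(Normal(0, τ), n_gr)        # group-level intercepts
    y ~ MvNormal(α .+ X * β .+ αⱼ[idx], σ^2 * I)
end

chain_std = sample(randomintercept_std(y_std, X_std, idx), NUTS(), 2_000)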
Sure thing. Here are the estimates for the Turing.jl model:
22×8 DataFrame
 Row │ parameters        mean          std        naive_se    mcse        ess       rhat     ess_per_sec
     │ Symbol            Float64       Float64    Float64     Float64     Float64   Float64  Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ α                 -3.45131      5.76711    0.0911859   0.252182    230.766   1.02397  1.63194
   2 │ treat1_Control    0.813562      4.82649    0.0763135   0.211874    255.586   1.01681  1.80746
   3 │ treat2_Triticale  -1.7054       4.33897    0.0686052   0.193385    365.632   1.02321  2.58569
   4 │ treat2_Wheat      -1.96195      4.34805    0.0687488   0.197807    318.722   1.027    2.25395
   5 │ treat2_Barley     2.43193       4.34733    0.0687373   0.197601    320.813   1.02696  2.26874
   6 │ treat2_Oats       -2.23604      4.35545    0.0688658   0.19943     308.04    1.02776  2.17841
   7 │ treat1_MeJA       -3.31079      4.82681    0.0763186   0.211433    256.962   1.01672  1.81719
   8 │ treat1_Insect     -3.10834      4.82561    0.0762996   0.211556    257.02    1.01669  1.8176
   9 │ σ                 1.07758       0.0523837  0.00082826  0.00407205  20.7791   1.28952  0.146946
  10 │ τ                 0.152454      0.139013   0.002198    0.0155327   11.6907   1.71629  0.0826748
  11 │ αⱼ[1]             -0.000636618  0.124812   0.00197345  0.00809698  52.625    1.08083  0.372155
  12 │ αⱼ[2]             -0.0461119    0.107025   0.00169221  0.00487228  83.2762   1.05705  0.588916
  13 │ αⱼ[3]             0.0772738     0.152267   0.00240756  0.0137612   16.1005   1.41414  0.11386
  14 │ αⱼ[4]             -0.0550869    0.149021   0.00235623  0.0141615   14.2971   1.51724  0.101107
  15 │ αⱼ[5]             -0.0663971    0.170312   0.00269288  0.016902    13.7304   1.55641  0.0970988
  16 │ αⱼ[6]             0.0555591     0.125597   0.00198586  0.00879491  30.5549   1.17043  0.216079
  17 │ αⱼ[7]             0.0147474     0.11526    0.00182243  0.0077317   53.4461   1.08289  0.377962
  18 │ αⱼ[8]             -0.0221508    0.161494   0.00255344  0.0157733   14.5065   1.49708  0.102588
  19 │ αⱼ[9]             -0.00909111   0.099925   0.00157995  0.00275826  1996.17   1.00353  14.1166
  20 │ αⱼ[10]            -0.0792182    0.156167   0.00246922  0.0152744   14.0818   1.49238  0.0995839
  21 │ αⱼ[11]            0.00108857    0.197207   0.00311812  0.0201408   15.2257   1.44926  0.107674
  22 │ αⱼ[12]            -0.0331863    0.1292     0.00204284  0.0103636   19.8629   1.29653  0.140467
And here are the estimates for the TuringGLM.jl model:
20×8 DataFrame
 Row │ parameters  mean        std        naive_se     mcse         ess      rhat      ess_per_sec
     │ Symbol      Float64     Float64    Float64      Float64      Float64  Float64   Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────
   1 │ α           -0.30134    0.185182   0.00292799   0.00596053   1051.16  1.00139   32.5437
   2 │ β[1]        -4.62083    0.216361   0.00342096   0.00692687   1030.89  1.00269   31.9162
   3 │ β[2]        -4.0301     0.226085   0.00357472   0.00873237   682.446  1.00343   21.1284
   4 │ β[3]        -4.32156    0.223349   0.00353146   0.00798252   881.987  1.00341   27.3061
   5 │ β[4]        -3.88406    0.162535   0.00256991   0.00304312   3126.17  0.99978   96.7853
   6 │ β[5]        -4.08865    0.164911   0.00260747   0.00377165   2972.37  1.00138   92.0239
   7 │ σ           1.05227     0.0476444  0.000753325  0.000765722  4253.41  1.0003    131.684
   8 │ τ           0.102047    0.0878335  0.00138877   0.00380113   529.434  1.0037    16.3911
   9 │ zⱼ[1]       -0.0284821  0.919094   0.0145322    0.0130367    4590.23  0.999867  142.112
  10 │ zⱼ[2]       -0.283288   0.981594   0.0155204    0.0151085    3973.92  0.999478  123.031
  11 │ zⱼ[3]       0.187311    0.937938   0.0148301    0.0128819    4123.02  1.00007   127.648
  12 │ zⱼ[4]       -0.117119   0.958306   0.0151521    0.0140993    4466.12  0.999459  138.27
  13 │ zⱼ[5]       -0.222428   0.966174   0.0152765    0.0150198    3850.42  1.00036   119.208
  14 │ zⱼ[6]       0.22366     0.952422   0.0150591    0.0138048    4170.4   0.999588  129.115
  15 │ zⱼ[7]       0.121774    0.956345   0.0151211    0.0188772    3589.15  0.999184  111.119
  16 │ zⱼ[8]       0.13658     0.942066   0.0148954    0.0166025    3821.37  1.0002    118.309
  17 │ zⱼ[9]       -0.0184732  0.939687   0.0148578    0.0169186    3493.39  0.999854  108.154
  18 │ zⱼ[10]      -0.236547   0.923681   0.0146047    0.0138926    4463.18  0.999422  138.179
  19 │ zⱼ[11]      0.151902    0.969382   0.0153273    0.0145629    4167.17  1.00028   129.014
  20 │ zⱼ[12]      -0.0126014  0.933396   0.0147583    0.0146358    4166.88  0.999517  129.006
Based on my sleuthing, it looks like the TuringGLM model is folding Barley and Control into the intercept as reference levels, and then estimating the other levels relative to them. But I back-calculated, and the math was pretty close.
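A quick version of that back-calculation, using the posterior means from the tables above:

# reference-level back-calculation (posterior means copied from the tables)
α_full    = -3.45131    # α, hand-coded Turing.jl model
b_control =  0.813562   # treat1_Control
b_barley  =  2.43193    # treat2_Barley
α_full + b_control + b_barley   # ≈ -0.206, vs TuringGLM's α ≈ -0.301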
For reference, the parameters used in the data generation were:
19×2 DataFrame
 Row │ parameter   coeff
     │ String15    Float64
─────┼───────────────────────
   1 │ Control     0.0
   2 │ Insect      -4.10827
   3 │ MeJA        -4.17167
   4 │ Barley      0.0
   5 │ Oats        -4.97833
   6 │ Triticale   -4.11681
   7 │ Wheat       -4.51245
   8 │ Barley1     0.634788
   9 │ Barley2     0.731347
  10 │ Barley3     0.620631
  11 │ Oats1       0.262506
  12 │ Oats2       0.480827
  13 │ Oats3       0.681654
  14 │ Triticale1  0.956313
  15 │ Triticale2  0.206814
  16 │ Triticale3  0.603935
  17 │ Wheat1      0.786783
  18 │ Wheat2      0.605044
  19 │ Wheat3      0.166197
How would normalization work with predictors that are purely 1 or 0? I looked into normalization, but wasn't sure if it would make a difference if the variables were dummy coded.
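For example, z-scoring a dummy column only shifts and rescales its two values, so the fit is equivalent up to a rescaled coefficient:

using Statistics

x = [0.0, 1.0, 1.0, 0.0, 1.0]   # a dummy-coded predictor
z = (x .- mean(x)) ./ std(x)    # still only two distinct values, ≈ -1.095 and ≈ 0.730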
I tried again, this time adjusting the X matrix to exclude Barley and Control so they would be incorporated into the intercept. The new parameter estimates are:
20×8 DataFrame
 Row │ parameters  mean          std        naive_se     mcse        ess      rhat      ess_per_sec
     │ Symbol      Float64       Float64    Float64      Float64     Float64  Float64   Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────
   1 │ α           -0.218272     0.176584   0.00279204   0.00410713  1599.31  1.00187   18.9299
   2 │ β[1]        -4.10688      0.221082   0.00349562   0.00534045  1838.74  1.00322   21.7639
   3 │ β[2]        -4.40813      0.216661   0.00342572   0.00576301  1662.02  1.00416   19.6721
   4 │ β[3]        -4.69131      0.21747    0.0034385    0.00515242  1814.77  1.00265   21.4801
   5 │ β[4]        -4.13625      0.166991   0.00264036   0.00626846  322.938  1.01617   3.82238
   6 │ β[5]        -3.92708      0.161498   0.00255351   0.0046016   1530.07  1.00321   18.1104
   7 │ σ           1.05389       0.0513702  0.000812234  0.0023493   120.992  1.02733   1.4321
   8 │ τ           0.111642      0.0884269  0.00139815   0.00555255  241.995  1.01163   2.86432
   9 │ αⱼ[1]       -0.000398632  0.115827   0.00183138   0.00189341  3743.89  1.0005    44.3137
  10 │ αⱼ[2]       -0.0353279    0.126083   0.00199355   0.00288435  1894.11  1.00021   22.4192
  11 │ αⱼ[3]       0.0359535     0.124245   0.00196448   0.00310668  1929.02  0.999885  22.8324
  12 │ αⱼ[4]       -0.00971979   0.114741   0.00181422   0.00177179  2993.38  0.99945   35.4304
  13 │ αⱼ[5]       -0.0272379    0.126527   0.00200056   0.00251181  2555.22  1.00055   30.2444
  14 │ αⱼ[6]       0.0423152     0.126711   0.00200348   0.00344255  1478.26  1.00011   17.4971
  15 │ αⱼ[7]       0.00848105    0.114706   0.00181365   0.0021953   3374.93  0.999696  39.9467
  16 │ αⱼ[8]       0.0131566     0.116819   0.00184707   0.00195295  2963.07  0.999978  35.0717
  17 │ αⱼ[9]       -0.0131799    0.116765   0.00184621   0.00276529  2158.81  1.00227   25.5522
  18 │ αⱼ[10]      -0.0341412    0.119848   0.00189497   0.00311182  1546.84  1.00051   18.3088
  19 │ αⱼ[11]      0.0250325     0.116745   0.0018459    0.00275691  2733.87  1.00108   32.3589
  20 │ αⱼ[12]      0.000353223   0.116472   0.00184158   0.00219533  3462.41  0.999878  40.9821
These are much closer to the TuringGLM estimates, but there are still some slight differences.
The number of parameters is different. I think TuringGLM is coding the categorical variables using K−1 levels; try to use the same categorical coding for the plain Turing data.
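A sketch of building such a K−1 (reference-level) design matrix for the plain Turing model via StatsModels, assuming the raw data sit in a DataFrame df with treatment, species, and response columns (names are placeholders):

using DataFrames, StatsModels

f  = @formula(response ~ 1 + treatment * species)
mf = ModelFrame(f, df)            # default DummyCoding drops one reference level per factor
X  = ModelMatrix(mf).m[:, 2:end]  # drop the intercept column, since the model has α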