Turing model producing unexpected summary statistics

juliohm · December 18, 2021, 11:35pm

I have a very simple Turing model where all variables are Normal or TruncatedNormal. For some reason, the summary statistics of the model are not the ones I expected. Can you please help with this MWE?

using Turing
using Distributions
using LinearAlgebra
using Random

@model function Model(X, Y, params)
  # size of design matrix
  n, p = size(X)

  # hyperparameters
  σF = params.σF
  σP = params.σP
  σM = params.σM
  τF = params.τF
  τP = params.τP
  τM = params.τM
  cF = params.cF
  cP = params.cP
  cM = params.cM

  αF ~ Normal(cF, σF)
  αP ~ Normal(cP, σP)
  αM ~ Normal(cM, σM)
  
  βF ~ MvNormal(τF^2 * I(p))
  βP ~ MvNormal(τP^2 * I(p))
  βM ~ MvNormal(τM^2 * I(p))

  for i in 1:n
    x = view(X, i, :)

    NF = Normal(αF + βF⋅x, σF)
    NP = Normal(αP + βP⋅x, σP)
    NM = Normal(αM + βM⋅x, σM)

    Y[i, 1] ~ Truncated(NF, 1000.0, 4000.0)
    Y[i, 2] ~ Truncated(NP, 0.0, Y[i, 1])
    Y[i, 3] ~ Truncated(NM, 0.2, 1.8)
  end
end

# ------------
# MAIN SCRIPT
# ------------

Random.seed!(2021)

X = randn(100, 10)

Y = [2200 80 1] .+ randn(100, 3)

params = (
  σF=80.0, σP=3.0, σM=0.07,
  τF=80.0, τP=3.0, τM=0.07,
  cF=2365.67, cP=81.86, cM=0.93,
  A=106.0
)

model = Model(X, Y, params)

prior = sample(model, Prior(), MCMCThreads(), 125, 8)

The summary statistics say that the betas have non-zero mean:

Summary Statistics
  parameters        mean       std   naive_se      mcse         ess      rhat   ess_per_sec 
      Symbol     Float64   Float64    Float64   Float64     Float64   Float64       Float64 

          αF   2367.5469   77.2003     2.4413    1.9845   1050.1537    0.9952      252.4408
          αP     81.8590    3.0893     0.0977    0.1060    970.1233    0.9982      233.2027
          αM      0.9303    0.0677     0.0021    0.0025    943.5514    1.0003      226.8152
       βF[1]      1.6034   80.3508     2.5409    2.2824   1023.2686    0.9990      245.9780
       βF[2]     -3.7451   82.0924     2.5960    2.5802    961.3972    1.0008      231.1051
       βF[3]      1.1207   80.0799     2.5323    2.7361   1018.4947    0.9976      244.8304
       βF[4]     -1.6076   78.7029     2.4888    2.3270   1082.2488    0.9969      260.1560
       βF[5]     -2.7930   79.1165     2.5019    2.5547    893.2253    0.9989      214.7176
       βF[6]     -2.7102   79.5150     2.5145    1.8956    986.7812    0.9955      237.2070
       βF[7]     -2.7330   80.4840     2.5451    2.4614   1180.3496    1.0000      283.7379
       βF[8]     -0.6590   79.4045     2.5110    2.6443   1043.9858    1.0049      250.9581
       βF[9]     -3.1116   82.5449     2.6103    2.4207   1066.4970    0.9977      256.3695
      ⋮            ⋮          ⋮         ⋮          ⋮          ⋮          ⋮           ⋮
                                                                              21 rows omitted

Am I missing something? I’ve been debugging this code for hours and can’t find out why it is not producing betas with zero mean.

Stephen_Wild · December 19, 2021, 2:13am

In this case, I think there are two reasons. The first reason is because you have set the prior for your βF coefficients to have a mean 0 and sd of 80. So in this case being off by a mean of -3.7 means that your estimated mean is off from the true mean by 1/25th of a standard deviation, which is pretty small.

I suggest using a tighter prior. It should make a difference. If you look at βP or βM, for example, they should be pretty close to zero, given their priors.

The second reason is that you are drawing too few samples given the size of your sd. You have roughly 1000 draws, which in this case is going to result in the beta’s being away from 0. Set it your samples to 12500 (e.g., prior = sample(model, Prior(), MCMCThreads(), 12500, 8) and the means will be much closer to zero. They won’t be zero, but they’ll be much closer.

I hope that helps.

juliohm · December 19, 2021, 2:25am

Thank you @Stephen_Wild , appreciate your help.

Topic		Replies	Views
Summary statistics for arbitrary variables within Turing model Probabilistic programming turing	2	443	July 15, 2020
Help -- can't get Turing to work on a simple model Probabilistic programming turing , stan	4	127	November 6, 2024
A simple model which I can't get Turing to fit to simulated data Probabilistic programming	1	436	May 24, 2022
Turing: help with slow model Probabilistic programming	46	2511	February 13, 2021
Help with first non-trivial Turing example Statistics	17	1282	May 7, 2020

Turing model producing unexpected summary statistics

Related topics