Adding related work to a multilevel Turing model

rikh · April 13, 2021, 11:01am

Suppose that we have fitted a multilevel model over data from multiple groups called A, B, C and D and that we’re trying to estimate the population means for these 4 groups. For example, this could look something like the figure below.

Now, suppose we also have information from the literature on a fifth group E, for which the sample mean mₑ, standard deviation sₑ and distribution are known. The distribution is a normal distribution. Could this information be added to the model in a way that E shows up in the posterior too?

I think that it would technically be possible by generating a bunch of samples with Normal(mₑ, sₑ), but I wonder if this actually makes sense.

opera_malenky · April 13, 2021, 8:33pm

I don’t know about doing it in Turing (easily or otherwise), but the approach I would take is to make a sampler that alternates between (1) estimating the posterior mean and variance of group E analytically (e.g., using Gibbs sampling), and then (2) updating the other model parameters, conditional on the data and the parameter sampled in (1).

I believe there have been (or still are) discussions about doing such conditional Gibbs steps inside a Turing model, but I don’t know if those discussions were just notional, if there’s actually code out there you might test, or if there’s maybe some sort of hack that will let you do a Gibbs step inside a regular Turing model.

EvoArt · April 13, 2021, 9:58pm

Not sure if I know exactly what you’re looking for. Since you mention multilevel, I’ll assume you’re interested in partial pooling, so that E informs A-D.

I guess you could condition your hyper prior for the means of A-D on a data point that represents the known mean of E.

mu ~ Normal(0, 1)
GroupMu ~ filldist(Normal(mu, sigma), 4)

# rest of model

Known_E ~ Normal(mu, sigma)

rikh · April 14, 2021, 12:26pm

@opera_malenky Thanks! That sounds like an interesting approach, I will look into it.

But that way I cannot pass the variance, because the variance is zero for one point. Am I correct?

Thank you both for your answers. I didn’t ask it very clearly, but I am also wondering whether it makes sense from a scientific perspective. Maybe you also have thoughts on that.

After thinking about it some more, I think it is quite a bad idea to combine it in my suggested way. I guess that it’s much more natural to put the distributions (in my example E) in the prior. Then, sample a bit from the prior and show that in the paper and, next, sample from the posterior to show how things have changed based on the new information, but I’m very unsure about whether this is the correct thing to do.

opera_malenky · April 14, 2021, 12:43pm

I think so. It’s basically what people do in random-effects meta-analysis. What you are describing here is a lot like the famous “eight schools” example of a multilevel model, with the biggest difference being that for 4 of your 5 groups, you have individual data, rather than just observed summary statistics.
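
A minimal sketch of what such an eight-schools-style model could look like in Turing, under some assumptions that are not from this thread: groups A-D are coded 1 to 4 in a hypothetical `group` vector, the priors are placeholders, and `se_E` is the standard error of the reported mean of E (for instance sₑ divided by the square root of E's sample size, which would have to come from the literature source). The names `multilevel_with_summary`, `theta`, `tau`, `m_E` and `se_E` are made up for illustration.

```julia
using Turing

# Sketch only: priors and names are assumptions, not a recommended specification.
@model function multilevel_with_summary(y, group, m_E, se_E)
    mu ~ Normal(0, 1)                          # population-level mean
    tau ~ truncated(Normal(0, 1), 0, Inf)      # between-group standard deviation
    sigma ~ truncated(Normal(0, 1), 0, Inf)    # within-group standard deviation
    theta ~ filldist(Normal(mu, tau), 5)       # group means for A, B, C, D and E

    # Groups A-D: individual observations
    for i in eachindex(y)
        y[i] ~ Normal(theta[group[i]], sigma)
    end

    # Group E: only the reported mean is observed, with known standard error
    m_E ~ Normal(theta[5], se_E)
end

# Illustrative call with made-up numbers
y = [0.1, 0.4, -0.2, 0.3, 0.9, 1.1, -0.5, -0.3]
group = [1, 1, 2, 2, 3, 3, 4, 4]
model = multilevel_with_summary(y, group, 0.5, 0.2)
chain = sample(model, NUTS(), 500; progress=false)
```

Because `m_E` is passed as a model argument, Turing treats it as observed data, so `theta[5]` shows up in the posterior next to the other group means and is partially pooled with them through `mu` and `tau`. Passing a standard error rather than a single point with zero variance is also one way to carry the uncertainty of the literature value into the model.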
