# How to estimate uncertainty in a LinearMixedModel?

In this example, I want to estimate the uncertainty around the beta value for each `Cut & Clarity` pair. I'm not sure how to interpret the `allpars` data. Is it producing a single sigma shared across all values of `Cut & Clarity`? Do I need to `simulate` in order to get what I want? I am not trying to get prediction error; I just want the uncertainty around the expected value at each point.

```julia
using RDatasets, MixedModels, DataFrames, Random  # Random for GLOBAL_RNG

d = dataset("ggplot2", "diamonds")
form = @formula(Price ~ (1|Cut & Clarity))
modelfit = fit(LinearMixedModel, form, d)
pb = parametricbootstrap(Random.GLOBAL_RNG, 1000, modelfit)
DataFrame(pb.allpars)

#= julia> DataFrame(pb.allpars)
3000×5 DataFrame
  Row │ iter   type    group          names        value
      │ Int64  String  String?        String?      Float64
──────┼─────────────────────────────────────────────────────
    1 │     1  β       missing        (Intercept)  4054.44
    2 │     1  σ       Cut & Clarity  (Intercept)   831.115
    3 │     1  σ       residual       missing      3907.46
    4 │     2  β       missing        (Intercept)  3925.81
    5 │     2  σ       Cut & Clarity  (Intercept)   726.458
    6 │     2  σ       residual       missing      3908.1
    7 │     3  β       missing        (Intercept)  3941.74
    8 │     3  σ       Cut & Clarity  (Intercept)   621.993
    9 │     3  σ       residual       missing      3918.18
   10 │     4  β       missing        (Intercept)  3987.62
   11 │     4  σ       Cut & Clarity  (Intercept)   811.536
   12 │     4  σ       residual       missing      3929.52
   13 │     5  β       missing        (Intercept)  3818.16
   14 │     5  σ       Cut & Clarity  (Intercept)   609.15
   15 │     5  σ       residual       missing      3883.62
   16 │     6  β       missing        (Intercept)  3786.48
   17 │     6  σ       Cut & Clarity  (Intercept)   850.474
    ⋮ │   ⋮      ⋮           ⋮             ⋮          ⋮
 2985 │   995  σ       residual       missing      3911.85
 2986 │   996  β       missing        (Intercept)  3802.83
 2987 │   996  σ       Cut & Clarity  (Intercept)   720.439
 2988 │   996  σ       residual       missing      3898.91
 2989 │   997  β       missing        (Intercept)  3782.61
 2990 │   997  σ       Cut & Clarity  (Intercept)   734.746
 2991 │   997  σ       residual       missing      3899.4
 2992 │   998  β       missing        (Intercept)  3692.06
 2993 │   998  σ       Cut & Clarity  (Intercept)   677.055
 2994 │   998  σ       residual       missing      3916.26
 2995 │   999  β       missing        (Intercept)  3856.67
 2996 │   999  σ       Cut & Clarity  (Intercept)   810.09
 2997 │   999  σ       residual       missing      3912.06
 2998 │  1000  β       missing        (Intercept)  3725.55
 2999 │  1000  σ       Cut & Clarity  (Intercept)   743.319
 3000 │  1000  σ       residual       missing      3906.36
                                     2966 rows omitted =#
```

If youโre just looking for confidence intervals, have you looked `DataFrame(shortestcovint(pb))`?

This is shown in the docs: Parametric bootstrap for mixed-effects models · MixedModels
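For reference, that call summarizes the bootstrap replicates in `pb` directly (a sketch, continuing from the fitted model above):

```julia
# 95% shortest coverage intervals for each model parameter,
# computed across the bootstrap replicates
DataFrame(shortestcovint(pb))
```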

That produces an interval for a single `Cut & Clarity` parameter, but I was looking for an interval for every value of `Cut & Clarity`. I want to make a coefficient plot to show the credible ranges for each value of `Cut` and `Clarity`.

```julia
pbdf = DataFrame(pb.allpars)
combine(groupby(pbdf, [:type, :group, :names]), :value => shortestcovint => :interval)

#= 3ร4 DataFrame
Row โ type    group          names        interval
โ String  String?        String?      Tupleโฆ
โโโโโโผโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
1 โ ฮฒ       missing        (Intercept)  (3600.75, 4104.61)
2 โ ฯ       Cut & Clarity  (Intercept)  (552.975, 906.98)
3 โ ฯ       residual       missing      (3892.35, 3939.2) =#
```

This might be what I want. Does it make sense?

```julia
using RDatasets, MixedModels, DataFrames, Random
using Statistics, StatsBase            # mean, sem
using CairoMakie, AlgebraOfGraphics    # Errorbars, data, mapping, visual, draw

d = dataset("ggplot2", "diamonds")
form = @formula(Price ~ (1|Cut & Clarity))
modelfit = fit(LinearMixedModel, form, d)
pb = parametricbootstrap(Random.GLOBAL_RNG, 1000, modelfit)
pbdf = DataFrame(pb.allpars)
combine(groupby(pbdf, [:type, :group, :names]), :value => shortestcovint => :interval)

d[!, :sim] = simulate(modelfit)
let c = combine(groupby(d, [:Cut, :Clarity]), :sim .=> [mean, sem])
    layers = (data(c)
              * visual(Errorbars)
              * mapping(:Cut, :sim_mean, :sim_sem => (x -> 2x); layout=:Clarity))
    draw(layers; axis=(; xticklabelrotation=π/8))
end

```

Without knowing your inferential and presentation goals, and potentially more about your study design, I don't know whether it makes sense for your purposes. I don't think a mixed model is the way to analyze the diamonds data, but the presentation you give is a way to get regularized "estimates" (technically predictions) of all those things.

Have you looked at MixedModelsMakie? `caterpillar` and `RanefInfo` will display similar information using the conditional variances instead of the bootstrap replicates.

I think this is the core thing I'm getting stuck on: what do you mean by "beta"? Usually, "beta" refers to the fixed-effect regression coefficients. The conditional modes / best linear unbiased predictions (BLUPs) / random effects are something else.
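To make the distinction concrete (a sketch, reusing `modelfit` from the code above):

```julia
fixef(modelfit)        # β: the fixed-effect coefficients — here just the (Intercept)
raneftables(modelfit)  # conditional modes (BLUPs), one entry per Cut & Clarity level
```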

Ah yes, beta was not what I meant. I want to show my results in terms of the uncertainty around the expected outcome associated with each value of the categorical parameters. I find this outcome-scale presentation easier to interpret than raw parameter estimates, especially for generalized linear mixed models. It looks like `MixedModelsMakie.caterpillar` produces estimates in terms of the parameter values, rather than best predictions.

Based on

```julia
d[!, :sim] = simulate(modelfit)
let c = combine(groupby(d, [:Cut, :Clarity]), :sim .=> [mean, sem])
    layers = (data(c)
              * visual(Errorbars)
              * mapping(:Cut, :sim_mean, :sim_sem => (x -> 2x); layout=:Clarity))
    draw(layers; axis=(; xticklabelrotation=π/8))
end
```

Running this code a few times, the predictions jump around quite a bit.

How should I get reliable estimates/predictions for each level of the categories?

`simulate` is only a single replication, so things will tend to change a fair amount for any parameter value. As for the BLUPs, they are completely new draws from the random-effects distribution each time. This is one of the critical ways that random effects differ from fixed effects: you're estimating the distribution of the random effects, and predicting the individual values. (This is also why the BLUPs aren't part of the bootstrap summary: they aren't parameters, and the bootstrap summary only stores parameters.)
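One way to see this (a sketch, assuming the non-mutating `simulate(rng, model)` method that draws a fresh response vector on each call):

```julia
using Random, Statistics, DataFrames

# Two independent simulations re-sample the random effects,
# so the per-group means differ noticeably between draws.
d[!, :sim1] = simulate(MersenneTwister(1), modelfit)
d[!, :sim2] = simulate(MersenneTwister(2), modelfit)
combine(groupby(d, [:Cut, :Clarity]), :sim1 => mean, :sim2 => mean)
```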

Those predictions are relative to the population-level values, i.e. centered at the corresponding fixed effects, so if you want to plot those for a simple intercepts-only model, you can do something like:

```julia
using DataFrames, AlgebraOfGraphics, CairoMakie  # continuing from the fitted `modelfit` above

grp = Symbol("Cut & Clarity")
df = leftjoin(DataFrame(raneftables(modelfit)[grp]),
              DataFrame(condVartables(modelfit)[grp]);
              on=grp)
select!(df,
        grp => ByRow(identity) => [:Cut, :Clarity],
        "(Intercept)" => "mean",
        :σ => ByRow(only) => "std")

layers = data(df) * visual(Errorbars) *
         mapping(:Cut, :mean, :std => (x -> 1.96x); layout=:Clarity)
draw(layers; axis=(; xticklabelrotation=π/8))
```

A similar result could also be achieved with MixedModelsMakie:

```julia
using CairoMakie, MixedModelsMakie

re = ranefinfo(modelfit, grp)
re.ranef[:, 1] .+= fixef(modelfit)[1]  # 'absolute' prediction for each group, instead of relative to the population effect
caterpillar!(Figure(), re)
```

Iโm a little bit hesitant to provide any more concrete advice without knowing more about your inferential problem (and I probably donโt have time at the moment to help you with the details of your inferential problem), because this feels like an XY Problem.
