Get siginificance of main fixed category effects in mixed linear model with MixedModels.jl

Hi group,
I have a dataframe of >10000 rows of repeated measurements on 25 subjects (OB) on 3 categorical factors (QF,B,PA) and 1 continuous (CCT) factor. The first few lines look like this:
│ Row │ OB │ QF │ BG │ PA │ CCT │ VAL │
│ 1 │ “1” │ “B” │ “B” │ “A” │ 3000.0 │ 6.0 │
│ 2 │ “1” │ “B” │ “B” │ “A” │ 3500.0 │ 8.0 │
│ 3 │ “1” │ “B” │ “B” │ “A” │ 4000.0 │ 8.0 │
│ 4 │ “1” │ “B” │ “B” │ “A” │ 5000.0 │ 2.0 │
│ 5 │ “1” │ “B” │ “B” │ “A” │ 6000.0 │ 1.0 │
│ 6 │ “1” │ “B” │ “B” │ “R” │ 3000.0 │ 1.0 │

dump(df) gives:

DataFrames.DataFrame 11250 observations of 6 variables
OB: DataArrays.PooledDataArray{String,UInt8,1}(11250)
String[“1”, “1”, “1”, “1”]
QF: DataArrays.PooledDataArray{String,UInt8,1}(11250) String[“B”, “B”, “B”, “B”]
BG: DataArrays.PooledDataArray{String,UInt8,1}(11250) String[“B”, “B”, “B”, “B”]
PA: DataArrays.PooledDataArray{String,UInt8,1}(11250) String[“A”, “A”, “A”, “A”]
CCT: DataArrays.DataArray{Float64,1}(11250)
[3000.0, 3500.0, 4000.0, 5000.0]
VAL: DataArrays.DataArray{Float64,1}(11250) [6.0, 8.0, 8.0, 2.0]

I want to find for each of the factors (3 categorical and 1 continuous) whether they have a significant impact on VAL.
If I run:
m = fit!(lmm(@formula(VAL ~ QF * PA * BG * CCT+ (QF + PA + BG + CCT | OB)), df))

I get for each of the fixed effects (and interactions) the p-values for the contrast between each the individual factor levels with the first factor level:

Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 6.90138 0.712937 9.68021 <1e-21
QF: BR -5.35172 0.895662 -5.97516 <1e-8
QF: CA 0.524138 0.843195 0.621609 0.5342
QF: OA 0.355862 0.847164 0.420063 0.6744
QF: V 0.164483 0.845997 0.194425 0.8458
QF: W 6.28 0.873654 7.1882 <1e-12
PA: B -0.32 0.842267 -0.379927 0.7040
PA: G -2.64345 0.848704 -3.11469 0.0018
PA: R -2.64724 0.859032 -3.08166 0.0021
PA: Y -0.757241 0.847848 -0.893134 0.3718
BG: G -1.30621 0.845539 -1.54482 0.1224
BG: W -1.27931 0.850409 -1.50435 0.1325
CCT -0.000172414 0.000160561 -1.07382 0.2829
QF: BR & PA: B 0.222069 1.19066 0.18651 0.8520

Although I could conclude that (after e.g. a Bonferonni correction) a factor is significant when at least one contrast is significant, it could always be that by chance the contrasts with the first level would not be. It also does not provide me with p-values for the significance of the total factor.
Is there a good way to do this repeated measures analysis? Can I actually use mixed linear models? What should I change in the above model fit?

It would help if you mentioned that you are using the MixedModels package.

I don’t understand how you expect to check on whether a factor is significant in a model where all of the factors are involved in a fourth-order interaction.

I haven’t advertised it as I am still looking at the name and formulation but there is a likelihood-ratio test function, MixedModels.lrt but to use it you need to fit both the alternative model and the null model explicitly.


Thanks for the quick response.

So, yes, I 'm using MixedModels.jl.

From your reply I understand that it’s not possible to check the siginificance of a main factor when there are (4) interactions? I have been reading up a bit more on this, and maybe I should have mentioned that I meant ‘simple main effects’. Would it be possible then or is it only when there are less interactions?

Anyway, even if I don’t use 4 factors but just perform it for 2 (with interaction) I get an output for the significance of the individual (categorical) factor levels:
m = fit!(lmm(@formula(VAL ~ 1 + BG + CCT + ( BG + CCT | OB)), df))
results in:

Fixed-effects parameters:
Estimate Std.Error z value P(>|z|)
(Intercept) 6.26582 0.48773 12.8469 <1e-37
BG: G 0.0405333 0.051211 0.791497 0.4287
BG: W -0.1248 0.109463 -1.14011 0.2542
CCT -0.000110253 9.89829e-5 -1.11386 0.2653

with for BG contrasts to the first factor level.

Is there a way to specify specific contrasts in the MixedModels package (this way I could loop through all of them)?

I’ll try the MixedModels.lrt function. But I have some questions on that:

  1. what would the null model be? would it be the full model minus the single factor for which I want the sigificance or one where I omit all fixed effects?
  2. whathever the null model, the full model would still be using contrasts with the first level of a factor, so this could for example be non-significant for all other levels while one or more of the other levels have significant contrasts with each other. Would the lrt function take that into account? Or would I need to call lrt on different models where the contrasts are changed? (and how would I change those contrast?).

Finally, CCT is a continuous factor (covariate), is the signifiance for CCT obtained using a linear relationship (or quadratic or …)?

Sorry for the pile of questions, but I’m really trying to understand how to best implement this in Julia.

Many thanks for your time!

1 Like

In version 0.10.0 of MixedModels, a named argument contrasts was added to lmm. Its value, which should be a Dict{Symbol, AbstractContrasts}, is passed to the ModelFrame constructor. See

using DataFrames

The rest of this is TL;DR ranting about statistical tests and the interpretation of such tests that is frequently taught.

Rather than saying “it’s not possible to check the significance of a main factor when there are interactions” I would say that “it is not possible to unambiguously define a significance level for a main effect in the presence of non-negligible interactions”.

When R. A. Fisher developed analysis of variance tests he did so on geometric principles, hence concepts like “degrees of freedom” that correspond to dimensions of particular linear subspaces of a vector space. Earlier work like Gosset’s derivation of the Student’s T distribution were done with brute force analysis but Fisher saw an elegant geometric interpretation. He wrote to Gosset about it and Gosset checked with Pearson, the most renowned statistician of the time, and neither of them understood what Fisher was talking about.

In any case Fisher generalized the idea of testing whether a model could be simplified to a submodel without paying too great a penalty in the ability to fit the observed data. It involves decomposing the response into orthogonal components in certain subspaces and comparing the squared lengths of these components normalized by the dimension of the subspaces. Fisher understood the relationship between the geometry and the statistical assumptions related to spherical normal distributions and linear predictors, to use the modern terminology.

This was all being done in the first half of the 20th century when the most sophisticated computation was pencil and paper or, perhaps, a desk calculator. Fisher realized that in certain cases, now called “balanced, orthogonal designs”, short cuts for the necessary tedious calculations existed. Over the course of time the emphasis, especially in areas where these techniques were applied, shifted to the short cuts and the resulting table but losing the context.

The biggest simplification in these balanced, orthogonal cases is that you can do multiple tests based on the results of fitting only one model - the most complex one. You can fit this one model and get a table that summarizes many different tests - even if the tests don’t make sense.

There isn’t a p-value or a “significance level” associated with a term in a model. These values are associated with a comparison of two models. To understand how to interpret the result you need to know both the “null” model and the “alternative” model. The null model is a special case of the alternative.

Leaving aside the issue of random effects for a moment suppose that you have factors A and B each with two levels, as in

julia> using DataFrames, GLM

julia> dat = DataFrame(A = pool(repeat(['N','Y'], inner = 20, outer = 2)),
                       B = pool(repeat(['L','H'], inner = 40)),
                       Y = zeros(80));

julia> form = @formula(Y ~ 1 + A * B);

julia> mf = ModelFrame(form, dat);

julia> mm = ModelMatrix(mf);

julia> srand(1234321);

julia> b = randn(4);

julia> dat[:Y][:] = mm.m * b + 0.2 * randn(80);

julia> show(fit(LinearModel, form, dat))

Formula: Y ~ 1 + A + B + A & B

               Estimate Std.Error  t value Pr(>|t|)
(Intercept)  -0.0891865 0.0482815 -1.84722   0.0686
A: Y            -1.6652 0.0682804 -24.3877   <1e-36
B: L          -0.492592 0.0682804 -7.21426    <1e-9
A: Y & B: L    -1.08836 0.0965631 -11.2709   <1e-17

julia> show(fit(LinearModel, form, dat, contrasts = Dict(:B => HelmertCoding())))

Formula: Y ~ 1 + A + B + A & B

              Estimate Std.Error  t value Pr(>|t|)
(Intercept)  -0.335483 0.0341402 -9.82662   <1e-14
A: Y          -2.20938 0.0482815 -45.7603   <1e-56
B: L         -0.246296 0.0341402 -7.21426    <1e-9
A: Y & B: L  -0.544178 0.0482815 -11.2709   <1e-17

The point here is that the estimate for the A coefficient and its t-statistic and the p-value have changed even though the model is the same in the sense that the fitted values and the residuals, etc. are unchanged and A is not changed.

In terms of models, the comparison is between the “full” model 1 + A + B + A&B and the “null” model 1 + B + A&B. The interaction is different if you define the contrasts for B differently. Generally it doesn’t make sense to test the significance of a main effect in the presence of a non-negligible interaction but people do it every day.

With random effects in the model things become even more complicated because the orthogonal decomposition of the response vector used in a linear model no longer applies. The fitted values are no longer obtained by projection onto the column span of the model matrix, which is what the least squares estimates are because they minimize the distance from the observed value to the column span of the model matrix. Random effects have a regularization term and their estimates are “shrunken” towards zero. That is why the test used in the MixedModels package is the likelihood ratio test.


BTW, @dmbates, if you implement the LRT in MixedModels, it would be nice to define the function in StatsBase to unify the interfaces, and later add support for GLMs and other kinds of models (following what we recently did with ftest in GLM.jl).

1 Like

See section 5 of Bill Venables’ famous unpublished paper, Exegeses on linear models for a much better explanation of the silliness than ensues when one doesn’t pay attention to the models being tested.


@nalimilan Adding a likelihood ratio test generic to StatsBase would be a fine idea, although some bikeshedding of the name may be desirable. I wrote the one in MixedModels in a hurry because a particular user needed it.

Thanks for the explanation and reference to the paper