Diagnosing "Fixed-effects matrix is rank-deficient"

I am using MixedModels fit(@formula(y ~ a + b + c + d + e), data) and it says “Warning: Fixed-effects matrix is rank-deficient”. I want to diagnose the problem. Can I find out which combinations of columns are not full rank? For example, is it a single categorical variable c that has only one value, or a pair of columns a and b that are dependent? Can I easily extract this information from the formula and data?


cc @dmbates @palday

The fundamental problem, and why a universal solution is difficult, are discussed in the docs.

You can look at the numerical rank of the model matrix to get an idea about how many “extra” columns there are, but that doesn’t tell you how many extra terms you have in your formula. Using orthogonal contrasts for categorical variables may help with numerical rank.
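As a rough sketch of that check, using only LinearAlgebra on a made-up matrix (the toy X, its column names, and the 1e-8 threshold are assumptions for illustration, not something MixedModels provides): the numerical rank gives the number of "extra" columns, and the nonzero entries of a null-space vector show which columns take part in a dependency.

```julia
using LinearAlgebra

# Toy model matrix (made up): intercept, a, b, and a column equal to a + b
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]
X = hcat(ones(5), a, b, a .+ b)
colnames = ["(Intercept)", "a", "b", "a_plus_b"]  # hypothetical names

r = rank(X)                 # numerical rank via SVD
nextra = size(X, 2) - r     # number of redundant columns

# Each null-space vector v satisfies X * v ≈ 0; its nonzero entries
# mark the columns involved in that linear dependency.
N = nullspace(X)
involved_sets = [findall(x -> abs(x) > 1e-8, N[:, j]) for j in axes(N, 2)]

for idx in involved_sets
    println("linearly dependent columns: ", colnames[idx])
end
```

With more than one redundant column the null-space basis is not unique, so separate dependencies may be mixed together in each vector; treat the output as a hint rather than a definitive list of offending terms.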

In practice, you can also see which predictors were dropped as part of the automatic attempts to handle rank deficiency – their estimate will be -0.0 (note the negative sign) and their standard error will be NaN.
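A minimal sketch of that check, on made-up vectors standing in for the estimates and standard errors (for a fitted model these would presumably come from coef(m1) and stderror(m1); the numbers and names below are invented for illustration):

```julia
# Made-up stand-ins for the coefficient table of a fitted model
cnames = ["(Intercept)", "a", "b", "c"]   # hypothetical names
est = [1.25, 0.40, -0.75, -0.0]
se  = [0.10, 0.05, 0.08, NaN]

# A dropped predictor shows up as a negative zero estimate with NaN standard error
dropped = [i for i in eachindex(est)
           if isnan(se[i]) && iszero(est[i]) && signbit(est[i])]

println("dropped coefficients: ", cnames[dropped])
```

Note that -0.0 == 0.0 is true in Julia, so signbit is needed to tell a negative zero from an ordinary zero.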


Alternatively, the evaluated rank, rank, and the pivot vector, piv, can be retrieved from the feterm (fixed-effects term) object. In the current release (v3.8.0) this is the first element of the feterms field in the fitted model:

julia> first(m1.feterms).rank
32

julia> show(first(m1.feterms).piv)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32]

(When the fixed-effects model matrix is deemed full rank the pivot vector is always 1:size(X, 2).)
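To illustrate how a rank and pivot vector can be read when the matrix is not full rank, here is a sketch that uses the column-pivoted QR from LinearAlgebra as a stand-in. This is not the decomposition MixedModels runs internally, so the pivot order it picks may differ from piv in a fitted model; the toy matrix and the tolerance are assumptions. The idea is that the first rank entries of the pivot vector index the columns that are kept and the remaining entries index the columns treated as redundant. ColumnNorm() requires Julia 1.7 or later.

```julia
using LinearAlgebra

# Toy model matrix (made up): intercept, a, b, and a column equal to a + b
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0]
X = hcat(ones(5), a, b, a .+ b)

F = qr(X, ColumnNorm())        # column-pivoted QR; F.p is the pivot vector
d = abs.(diag(F.R))            # magnitudes of R's diagonal, largest first
r = count(>(1e-8 * d[1]), d)   # numerical rank under an assumed relative tolerance

kept = sort(F.p[1:r])          # columns retained
dropped = F.p[r+1:end]         # columns pivoted to the end, i.e. redundant

println("rank = ", r, ", kept = ", kept, ", dropped = ", dropped)
```

Which member of a dependent set ends up in dropped depends on the pivoting rule, so a different but equally valid column may be reported by another decomposition.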

In the development version of MixedModels, v4.0.0-DEV, replace first(m1.feterms) with m1.feterm.
