Implement Mixed Models with sparse X and Z matrices

Antonina_Klyuyeva · November 3, 2020, 4:28pm

Hi there,
I use mixed models on a large file (500000 rows).
My model formula looks like this:
Y ~ 0 + num1:factor1 + num1:factor2 + num2:factor3 + factor4 + (0 + num3|subject) + (0 + num4|subject) + (1|subject),
where num* are numeric variables and factor* are categorical variables (factors).

Since categorical variables have many unique levels, the fixed effects matrix is very sparse (sparsity ~0.9).
Fitting such a matrix when it is handled as dense requires a lot of time and RAM.

I had the same problem with linear regression.
My dense matrix was 20GB, but when I converted it to sparse it became only 35MB.
So, I implemented the regression in R using the following functions:

sparse.model.matrix (to create a sparse model/design matrix) and
MatrixModels:::lm.fit.sparse (to fit a sparse matrix and calculate coefficients).
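
To make the dense-versus-sparse gap concrete, here is a minimal Julia sketch using only the SparseArrays standard library. The sizes and the randomly generated level codes are illustrative assumptions, not the data described above; the point is only that an indicator matrix for a many-level factor stores one nonzero per row.

```julia
using SparseArrays, Random

Random.seed!(1)
n, k = 10_000, 500                 # rows and factor levels (illustrative sizes)
codes = rand(1:k, n)               # hypothetical level code for each row

# Indicator (one-hot) matrix: exactly one stored entry per row.
X = sparse(1:n, codes, ones(n), n, k)

dense_bytes  = n * k * sizeof(Float64)   # size of an equivalent Matrix{Float64}
sparse_bytes = Base.summarysize(X)       # values + row indices + column pointers

println("stored nonzeros: ", nnz(X))
println("dense bytes:     ", dense_bytes)
println("sparse bytes:    ", sparse_bytes)
```

The dense size grows with rows times columns, while the sparse size grows with the number of nonzeros, which is why adding factor levels is cheap in the sparse representation.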

Can I apply a similar approach to mixed models and implement it using Julia packages?
What functions / packages can I use to implement this?

That is, my main question is whether it is possible to implement mixed models with sparse matrices?
What package/functions should I use to create X and Z sparse model matrices?
Then, which function should I use for fitting the model with sparse matrices to get coefficients?

I would be very grateful for any help with this!

dmbates · November 3, 2020, 5:49pm

There is some provision in the MixedModels package for working with sparse model matrices for the fixed-effects parameters. I haven’t tried it out myself and am not sure how well it is integrated with the StatsModels package which does the conversion from formula/data to model matrices. Perhaps @palday or @dave.f.kleinschmidt may be able to provide more detail on how easy or difficult it would be.

Juan · November 3, 2020, 7:07pm

You should try with
https://github.com/FixedEffects/FixedEffectModels.jl

Antonina_Klyuyeva · November 3, 2020, 7:34pm

@Juan Can this package handle mixed-model formulas (I mean the random-effects part)?

Antonina_Klyuyeva · November 3, 2020, 7:39pm

@dmbates Thank you for the answer. I’m using the MixedModels package now, but it looks like it handles the X matrix as dense, since model fitting consumes an enormous amount of RAM.

palday · November 3, 2020, 7:40pm

FixedEffectModels doesn’t. Note also that “fixed effects” in econometrics has a different meaning than elsewhere (it’s “categorical fixed effect” in the terminology in MixedModels).

The Z matrices in MixedModels are already sparse. That’s about half the magic of the MixedModels approach compared to others: @dmbates developed a way to express fitting as a sparse penalized least squares problem, while most techniques depend on a dense generalized least squares problem. The X matrix can be sparse, but the formula interface won’t generate it. If you call the LinearMixedModel constructor directly with a sparse X that you constructed by hand, it will work, though. Support for sparse FeMat was one of the changes in MixedModels 3.0.
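
As a rough illustration of what “a sparse X constructed by hand” could look like, here is a minimal sketch that builds the columns for a term like num1:factor1 with the SparseArrays standard library. The toy vectors and the integer level codes are hypothetical. How the result is passed to the LinearMixedModel constructor depends on the MixedModels version, so that step is deliberately left out.

```julia
using SparseArrays

# Hypothetical toy data: a numeric covariate and a factor stored as integer codes.
num1    = [0.5, 1.2, -0.3, 2.0, 0.7]
factor1 = [1, 3, 2, 3, 1]            # level code of each row
nlevels = 3

# Column j of the num1:factor1 block holds num1 for rows at level j and
# is zero elsewhere, so only one entry per row is stored.
X = sparse(1:length(num1), factor1, num1, length(num1), nlevels)

println(X)
```

Blocks for several terms can be combined column-wise with hcat, which keeps the result sparse.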

palday · November 3, 2020, 7:42pm

Relevant (merged) pull request: https://github.com/JuliaStats/MixedModels.jl/pull/309

This functionality is very new and the people using it so far are close collaborators, so we haven’t yet written good documentation on it.

EDIT:

One more tip: use Grouping() pseudo-contrasts when you have many levels of the grouping variable: contrasts=Dict(:subject => Grouping())
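
A minimal sketch of where that keyword goes, assuming a MixedModels version that exports Grouping. It uses the small sleepstudy example data shipped with the package (its grouping column is named subj), not the model from this thread.

```julia
using MixedModels

# Example data bundled with MixedModels; columns: subj, days, reaction.
sleep = MixedModels.dataset(:sleepstudy)

# Grouping() marks subj as a grouping variable only, so no contrast
# matrix is built for its levels.
m = fit(MixedModel,
        @formula(reaction ~ 1 + days + (1 | subj)),
        sleep;
        contrasts = Dict(:subj => Grouping()))

println(m)
```

With only 18 subjects the keyword makes no practical difference here; it is intended for grouping variables with many levels.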

palday · November 3, 2020, 7:48pm

I should also emphasize, @Antonina_Klyuyeva, that if you run into problems but have a good minimal working example, then we’re (well, I’m) happy to help. And a good MWE is great for improving our support and expanding our tests. Note that your MWE can also include real data, if you’re able to share that.
