Random effect MixedModels

ktsiamasiotis · July 3, 2018, 5:54am

Hi,
I used MixedModels package a couple of times in the past as a faster alternative to R’s lmer. Recently I downloaded latest official Julia version (0.6.3) and tried with some datasets. I compared the results with old MixedModels and Julia version, and I found that random effects are not matching at all (using ranef(lmeModel, named=true)). Digging little deeper i managed to match them using the following snippet:

randomEff = transpose(ranef(lmeModel, named=true)[1])
rows, columns = names(randomEff)
randomDF = DataFrame(randomEff)
names!(randomEff.colindex, map(parse, columns))
randomEff[:store] = unique(lmeModel.trms[1].f)
dfCols = randomEff.colindex.names
randomEff = stack(randomEff,dfCols[1:end-1], :store)

Now random effects appear very close to old results.
The question is: Am I doing something wrong or there is an issue with random effects in new version?
Please note that I have very little experience with Julia.

dmbates · July 3, 2018, 2:32pm

This is https://github.com/dmbates/MixedModels.jl/issues/130 which I hope to address today.

dmbates · July 3, 2018, 5:01pm

I am having difficulty reproducing the problem and would appreciate a reproducible example. Either I have somehow introduced a reordering or I misunderstand the relationship between the refs and the levels of a CategoricalArray.

I may need help from @nalimilan if it is a misunderstanding of the innards of the CategoricalArrays package. For some reason I am still confused by the documentation and the code in that package. The fact that this hinges on the order of unique(lmeModel.trms[1].f) is very, very strange to me.

nalimilan · July 3, 2018, 5:37pm

I guess that’s due to the fact that you use refs as if its values corresponded to the order in which levels are presented, while it actually refers to the internal order of the levels (index). You need to use CategoricalArrays.order(x.pool)[x.refs] to get codes corresponding to user-visible ordering of levels.

dmbates · July 3, 2018, 5:55pm

Okay, thanks.

I must admit that I find that nomenclature highly unintuitive. To me it is axiomatic that for categorical data the relationship

levels(f)[f.refs] == f

should hold.

Would you be willing to redefine the result of levels(f) so that this relationship does hold? Right now I don’t understand what the purpose of having a levels function is if the result cannot be indexed by the refs.

dmbates · July 3, 2018, 6:05pm

Alternatively, would you be willing to add another extractor function, say reflevels, that provides the levels that can be indexed by the refs? I don’t object to having internal representations that allows for fast reordering, etc. In my application I want to work with the refs vector because the, potentially very large, model matrices for the random effects are generated from indicators of the refs. So really this problem for me is comparatively minor if I can just find out how to label the indicator columns.

nalimilan · July 3, 2018, 6:12pm

Yeah, I know it’s somewhat confusing. The point of this level of indirection is that things like CategoricalArray(["c", "b", "a"]) can be coded on the fly without resetting all the refs at the end, once we realize that levels(x) == ["a", "b", "c"] doesn’t match the order of appearance (which is used for refs since we cannot know the order beforehand). It also allows changing the order of levels without modifying the underlying data.

The purpose levels is to return the levels in the order the user wants to see them. What you seem to be interested in is rather the internal order of levels, which is CategoricalArrays.index(x.pool). You can always rely on the property that CategoricalArrays.index(x.pool)[x.refs] == x.

EDIT: I’ve added a warning in the CategoricalArrays manual: https://github.com/JuliaData/CategoricalArrays.jl/pull/152

Topic		Replies	Views
Dictionary for random effects in MultiLevel.jl model General Usage	1	413	June 13, 2023
Factors for regression models in julia Statistics question , package	5	1074	May 26, 2021
How to obtain numeric arrays from a linear mixed model formula? Statistics question	11	1193	June 11, 2020
MixedModels and Random Interactions Statistics question , package , statistics , error	6	1553	August 17, 2017
Mixed multi-membership models in Julia Statistics	4	614	June 24, 2020

Random effect MixedModels

Related topics