How to fit MixedModels with the option REML=TRUE?


#1

I need to use the package MixedModels.jl for regression with random effects.

I’ve created the following toy example.

using DataFrames, MixedModels, StatsBase, Distributions, LinearAlgebra, SparseArrays  

N=50
x1 = repeat(1:N, outer=N);
x2 = repeat(1:N, inner=N);
x3 = sqrt.(1:N^2);
gg = repeat(1:5,inner=trunc(Int,N^2/5));
y = 1 .- 2*x1 + 3*x2 + 0.5*x1.*x2 + rand(N^2) + x3.*rand(N^2);
data = DataFrame(x1=x1, x2=x2, x3=x3, x1x2=x1x2, y=y,gg=gg)
categorical!(data, :gg);

It can be fit with this command:

fit(LinearMixedModel, @formula(y ~x1+x2+x3+x1&x2 + (1|gg)), data)

julia> fit(LinearMixedModel, @formula(y ~x1+x2+x3+x1&x2 + (1|gg)), data)
Linear mixed model fit by maximum likelihood
 Formula: y ~ x1 + x2 + x3 + x1 & x2 + (1 | gg)
     logLik        -2 logLik          AIC             BIC
 -9.34933576×10³  1.86986715×10⁴  1.87126715×10⁴  1.87534398×10⁴

Variance components:
              Column    Variance   Std.Dev.
 gg       (Intercept)    0.000000  0.000000
 Residual              103.709274 10.183775
 Number of obs: 2500; levels of grouping factors: 5

  Fixed-effects parameters:
             Estimate   Std.Error  z value P(>|z|)
(Intercept)   2.03282     1.33776  1.51957  0.1286
x1           -2.01012   0.0288441 -69.6893  <1e-99
x2            3.09139   0.0766062  40.3543  <1e-99
x3           0.440691    0.086853  5.07399   <1e-6
x1 & x2      0.499277 0.000980049  509.441  <1e-99

But I need it to use the REML method instead of the maximum likelihood method. How can I tell it to do it?

On R’s lme4 you can switch between these methods with the option REML=TRUE or FALSE, but I can’t find anything like that on MixedModels docs.

Another simpler question, do I need to load all this packages: DataFrames, MixedModels, StatsBase, Distributions, LinearAlgebra, SparseArrays?
I’ve seen it at some examples but I don’t know it they are all needed for this simple examples (or logit regression) or they would be automatically loaded by MixedModels.jl.


#2

Re question 2: No, you don’t. You’ll also need to be using the packages that export functions/constructors you want to use. In your case that’s the call to DataFrame(...) which requires using DataFrames and the call to fit(...) which required using MixedModels


#3

You can do using StatsKit, that should give you the packages you need for this.
REML fitting is not implemented in the Julia version. I asked about this recently on Slack, and @dmbates didn’t seem to think it was very difficult to implement, but not very necessary either. https://julialang.slack.com/archives/C6821M4KE/p1546889209072700?thread_ts=1546864137.071500&cid=C6821M4KE


#4

Try the REML branch of the MixedModels package. i.e. in the package REPL use

(1.1) pkg> add MixedModels#REML

To fit with REML use the optional named argument, REML = true in the call to fit!, as in

julia> fm1 = fit!(LinearMixedModel(@formula(Y ~ 1 + (1|G)), dat[:Dyestuff]), REML=true)
Linear mixed model fit by REML
 Formula: Y ~ 1 + (1 | G)
 REML criterion at convergence: 319.6542768422514

Variance components:
              Column    Variance  Std.Dev.
 G        (Intercept)  1764.0504 42.000600
 Residual              2451.2499 49.510099
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)    1527.5   19.3834 78.8045  <1e-99

I just committed this a few minutes ago and I haven’t added tests yet but, on very simple examples like this, it seems to be working as expected.


#5

I’ll try, thank you.
Anyway you said we shouldn’t use it with logistic regressions, isn’t it?
Then what should be use if we want to study the random terms?


#6

Maximum Likelihood