How to fit MixedModels with the option REML=TRUE?

I need to use the package MixedModels.jl for regression with random effects.

I’ve created the following toy example.

using DataFrames, MixedModels, StatsBase, Distributions, LinearAlgebra, SparseArrays  

N=50
x1 = repeat(1:N, outer=N);
x2 = repeat(1:N, inner=N);
x3 = sqrt.(1:N^2);
gg = repeat(1:5,inner=trunc(Int,N^2/5));
y = 1 .- 2*x1 + 3*x2 + 0.5*x1.*x2 + rand(N^2) + x3.*rand(N^2);
data = DataFrame(x1=x1, x2=x2, x3=x3, x1x2=x1x2, y=y,gg=gg)
categorical!(data, :gg);

It can be fit with this command:

fit(LinearMixedModel, @formula(y ~x1+x2+x3+x1&x2 + (1|gg)), data)

julia> fit(LinearMixedModel, @formula(y ~x1+x2+x3+x1&x2 + (1|gg)), data)
Linear mixed model fit by maximum likelihood
 Formula: y ~ x1 + x2 + x3 + x1 & x2 + (1 | gg)
     logLik        -2 logLik          AIC             BIC
 -9.34933576×10³  1.86986715×10⁴  1.87126715×10⁴  1.87534398×10⁴

Variance components:
              Column    Variance   Std.Dev.
 gg       (Intercept)    0.000000  0.000000
 Residual              103.709274 10.183775
 Number of obs: 2500; levels of grouping factors: 5

  Fixed-effects parameters:
             Estimate   Std.Error  z value P(>|z|)
(Intercept)   2.03282     1.33776  1.51957  0.1286
x1           -2.01012   0.0288441 -69.6893  <1e-99
x2            3.09139   0.0766062  40.3543  <1e-99
x3           0.440691    0.086853  5.07399   <1e-6
x1 & x2      0.499277 0.000980049  509.441  <1e-99

But I need it to use the REML method instead of the maximum likelihood method. How can I tell it to do it?

On R’s lme4 you can switch between these methods with the option REML=TRUE or FALSE, but I can’t find anything like that on MixedModels docs.

Another simpler question, do I need to load all this packages: DataFrames, MixedModels, StatsBase, Distributions, LinearAlgebra, SparseArrays?
I’ve seen it at some examples but I don’t know it they are all needed for this simple examples (or logit regression) or they would be automatically loaded by MixedModels.jl.

Re question 2: No, you don’t. You’ll also need to be using the packages that export functions/constructors you want to use. In your case that’s the call to DataFrame(...) which requires using DataFrames and the call to fit(...) which required using MixedModels

1 Like

You can do using StatsKit, that should give you the packages you need for this.
REML fitting is not implemented in the Julia version. I asked about this recently on Slack, and @dmbates didn’t seem to think it was very difficult to implement, but not very necessary either. Slack

3 Likes

Try the REML branch of the MixedModels package. i.e. in the package REPL use

(1.1) pkg> add MixedModels#REML

To fit with REML use the optional named argument, REML = true in the call to fit!, as in

julia> fm1 = fit!(LinearMixedModel(@formula(Y ~ 1 + (1|G)), dat[:Dyestuff]), REML=true)
Linear mixed model fit by REML
 Formula: Y ~ 1 + (1 | G)
 REML criterion at convergence: 319.6542768422514

Variance components:
              Column    Variance  Std.Dev.
 G        (Intercept)  1764.0504 42.000600
 Residual              2451.2499 49.510099
 Number of obs: 30; levels of grouping factors: 6

  Fixed-effects parameters:
             Estimate Std.Error z value P(>|z|)
(Intercept)    1527.5   19.3834 78.8045  <1e-99

I just committed this a few minutes ago and I haven’t added tests yet but, on very simple examples like this, it seems to be working as expected.

3 Likes

I’ll try, thank you.
Anyway you said we shouldn’t use it with logistic regressions, isn’t it?
Then what should be use if we want to study the random terms?

Maximum Likelihood