How to add weights parameter to Generalized Mixed Model

I’m not sure. The results seem quite different.

R:
summary(glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic)))

Julia
MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + Age & Sex + (1 | Class)), titanic[titanic.Freq .> 0,:], wts = titanic[titanic.Freq .> 0,"Freq"], Bernoulli(), nAGQ = 3)

image

julia> form = @formula(Survived ~ 1 + Age*Sex + (1|Class));

julia> nonzero = filter(:Freq => !iszero, titanic);

julia> m = fit(GLMM, form, nonzero, Bernoulli(), wts=nonzero.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
  Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

  Deviance: 2210.0492

Variance components:
         Column    Variance   Std.Dev.  
Class (Intercept)  0.40040641 0.63277675

 Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────────────────
                         Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────────────────
(Intercept)              1.26181    0.339321     3.72   0.0002
Age: Child              -0.123581   0.335105    -0.37   0.7123
Sex: Male               -2.61007    0.14972    -17.43   <1e-67
Age: Child & Sex: Male   1.89475    0.432308     4.38   <1e-4
──────────────────────────────────────────────────────────────

To get the same answers as the call to glmer you would need to change the order of the levels on Age and Sex.

1 Like

#facepalm

Thanks for your help/patience.

The /beta coefficients line up now.
Followup: Any idea why the std.error, z-value, and p-values don’t match?

as.data.frame(Titanic) %>% 
    mutate(Age = relevel(x = Age, ref = "Adult"),
           Sex = relevel(x = Sex, ref = "Female")) %>%
    glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 3, data = .) %>% 
    summary()

The standard errors, z-values and p-values match more-or-less with the results I posted. Keep in mind that these are even more approximate for a GLMM than for a LMM so “matching results” should be done with a rather wide tolerance.

@dmbates
I must be really dense today or something is off in my install (Win10-64).

Using Julia 1.4.2
image

I updated MixedModels
image

and ran your initial example code:
image

which for me yields
image

and not your results
image

I’ve gotten the same results with both the Julia CL and Atom IDE.

Do you have any suggestions what the problem might be?

I was using MixedModels#master as in

(@v1.4) pkg> add MixedModels#master

It seems that the coefficients are properly estimated but the standard deviation of the random effects and the standard errors are incorrect. @palday Do you recall if there was an issue related to weights in the released version v2.3.0? You have been doing much more on GLMMs while I have been down the automatic differentiation rabbit hole.

I was unable to reproduce the incorrect fit with MixedModels 2.3.0 – neither with nAGQ=1 nor nAGQ=3.

I tried to rule out the user hardware error by using a JuliaBox but ran into issues with adding the package :disappointed_relieved:

The standard error issue that I’m aware of never made it into a release – there have been some intermittent issues with weights but I don’t think they’re related.

One thing I saw when skimming the thread is that there was one fit at the top with nAGQ > 1 and fast=true. This corresponds to a configuration that simply isn’t possible in lme4, namely AGQ but while optimizing on only the random (and not the fixed) effects. In lme4, you get the Laplace approximation (nAGQ=1) when you optimize on only the RE (confusingly with the option nAGQ=0).

A cliche’d suggestion: can you run locally, but after restarting julia? So restart julia, do the updates, then do the fitting. Don’t update while a package is loaded.

I’ve been working on a RCall compatibility for MixedModels <=> lme4 in my package JellyMe4 (it’s on the Julia registry). The latest release won’t move glmer to GeneralizedMixedModel but should be able to move Bernoulli models in the other direction (and will preserve contrasts). The funny thing there is that I’m able to reproduce @laut 's fit that way:

julia> using RCall, JellyMe4
[ Info: Precompiling RCall [6f49c342-dc21-5d91-9882-a32aef131414]
[ Info: Precompiling JellyMe4 [19ac8677-9a15-4623-9afd-84acc6165ce7]

julia> model = (m, nonzero);
julia> @rput model;
[ Info: contrasts on interaction terms are assumed to decompose into products of contrasts on the first-order terms
[ Info: (if you don't know what that means, you're probably fine)
┌ Warning: Some accuracy is lost in translation for GLMMs. This is fine with plotting, but proceed with caution for inferences.
│ You can try letting the model "reconverge" in lme4 with
│     update(model, control=glmerControl(optimizer="nloptwrap", calc.derivs=FALSE)).
└ @ JellyMe4 ~/.julia/packages/JellyMe4/Hupt7/src/glmerMod.jl:144
[ Info: lme4 handles deviance for GLMMs differently than MixedModels.jl
[ Info: for the correct comparison, examine -2*logLik(RModel) and deviance(JuliaModel)

julia> R"summary(model)"
RObject{VecSxp}
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Survived ~ 1 + Age + Sex + Age:Sex + (1 | Class)
   Data: jellyme4_data
Control: glmerControl(optimizer = "nloptwrap", optCtrl = list(maxeval = 1),  
    calc.derivs = FALSE)

     AIC      BIC   logLik deviance df.resid 
    49.0     54.9    -19.5     39.0       19 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.0631 -0.7030  0.5001  0.7373  2.1385 

Random effects:
 Groups Name        Variance Std.Dev.
 Class  (Intercept) 0.4006   0.6329  
Number of obs: 24, groups:  Class, 4

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)  
(Intercept)        1.2618     0.9218   1.369   0.1710  
AgeChild          -0.1235     1.4687  -0.084   0.9330  
SexMale           -2.6101     1.2214  -2.137   0.0326 *
AgeChild:SexMale   1.8948     1.9781   0.958   0.3381  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) AgChld SexMal
AgeChild    -0.552              
SexMale     -0.666  0.416       
AgChld:SxMl  0.411 -0.728 -0.617

julia> m
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
  Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

  Deviance: 2210.0511

Variance components:
         Column   Variance Std.Dev.
Class (Intercept)  37.72355 6.14195

 Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
─────────────────────────────────────────────────────────────────
                         Estimate  Std.Error     z value  P(>|z|)
─────────────────────────────────────────────────────────────────
(Intercept)              1.26179     3.29347   0.383119    0.7016
Age: Child              -0.123528    3.25191  -0.0379863   0.9697
Sex: Male               -2.61011     1.45291  -1.79647     0.0724
Age: Child & Sex: Male   1.89476     4.19518   0.451653    0.6515
─────────────────────────────────────────────────────────────────

One thing to note is that weights don’t work in the current JellyMe4 release, so those are getting lost in translation. This suggests that something is happening with the weights in your bad run…

Just copying and pasting @dmbates code in the first answer, I’m also getting a different result in variance, Std.Dev and Std.errors estimates.

using RDatasets, MixedModels

titanic = RDatasets.dataset("datasets", "Titanic");

m = fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age + Sex + (1|Class)), titanic, Binomial(), wts=float.(titanic.Freq))

m
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
  Survived ~ 1 + Age + Sex + (1 | Class)
  Distribution: Binomial{Float64}
  Link: LogitLink()

  Deviance: 2227.9565

Variance components:
         Column    Variance  Std.Dev. 
Class (Intercept)  26.841378 5.1808665

 Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
────────────────────────────────────────────────────
             Estimate  Std.Error    z value  P(>|z|)
────────────────────────────────────────────────────
(Intercept)   1.12409    2.76454   0.406611   0.6843
Age: Child    1.04558    2.03346   0.514188   0.6071
Sex: Male    -2.41575    1.16525  -2.07317    0.0382
────────────────────────────────────────────────────

I’m on MixedModels v2.3.0

I’m reasonably sure that’s the correct answer (because that’s the only one that make sense in terms of other things we know are going on).

That said, there are only 4 levels of class, so we’re already really pushing the limits of what we should be fitting. After all, we know that variance estimates (outside of a mixed models context) based on 4 observations generally won’t be great.

@palday I’ll try that now, I just tested that MixedModels doesn’t load on JuliaBox’s VM’s on
1.0.5

or Julia 1.3.1

This is my quick dirty way of testing reproducibility.

I’m not sure what’s going on there, but MixedModels 2.3.0 does have Julia 1.3 listed in its compatibility requirements, so it’s not surprising that it doesn’t run on earlier versions (in part because 1.3 has some threading and linear algebra enhancements that we take advantage of).

@palday but lme4 says that the results are statistically significant.

I think because when weighted the properly sample size is 2,201
sum(as.data.frame(Titanic)$Freq)

My guess is that the optimum isn’t well defined and that we’re seeing two local optima. I don’t have time at the moment to check the profiles (the code for this isn’t finalized in MixedModels.jl and I – and I suspect @dmbates too – have spent too much time in Julia lately to remember how to produce a profile plot quickly there). Because looking at the random effects – which is where the magic happens – we’re seeing approximately the same two answers again and again. Given that there are only 4 groups, this isn’t too surprising, but I would have to double check.

If I de-tabulate the tabulated Titanic dataset so that each row represents an individual person the results change:

I don’t know why. I understood the weights parameter to be intended for just this purpose.
See:

and:

It produces the exact same [except residuals] results on a GLM but not GLMER.

If I use the de-tabulated data in Julia:
R
write.csv(x = tidyr::uncount(data = data.frame(Titanic), weights = data.frame(Titanic)$Freq)[,-5], file = "C://Users/laut/Desktop/titanic.tsv")
Julia
titanic2 = CSV.File("C:\\Users\\laut\\Desktop\\titanic.tsv"; delim = "\t", header = true) |> DataFrame;
MixedModels.fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age * Sex + (1 | Class)), titanic2, Bernoulli(), nAGQ = 1)
image