How to add weights parameter to Generalized Mixed Model

How do you add the weight of an observation to a Mixed Model? I thought I could add the Freq column to wt argument, but apparently not.

``````using RDatasets MixedModels
titanic = RDatasets.dataset("datasets", "Titanic")
titanic.surv_flg = titanic.Survived .== "Yes";
``````

This runs:

``````MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + (1 | Class)), titanic, Bernoulli(), nAGQ = 2, fast = true)
``````

But this doesnβt

``````MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age * Sex + (1 | Class)), titanic, wt = Freq, Bernoulli(), nAGQ = 2, fast = true)
``````

The short answer is that the argument name is `wts`, not `wt`. Using `MixedModels#master` under Julia-1.4.2 I get

``````julia> using RDatasets, MixedModels
[ Info: Precompiling MixedModels [ff71e718-51f3-5ec2-a782-8ffcbfa3c316]

julia> titanic = RDatasets.dataset("datasets", "Titanic");

julia> m = fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age + Sex + (1|Class)), titanic, Binomial(), wts=float.(titanic.Freq))
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
Survived ~ 1 + Age + Sex + (1 | Class)
Distribution: Binomial{Float64}

Deviance: 2227.9565

Variance components:
Column    Variance  Std.Dev.
Class (Intercept)  0.38409626 0.619755

Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
ββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate  Std.Error  z value  P(>|z|)
ββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept)   1.1241    0.330702     3.40   0.0007
Age: Child    1.04557   0.243236     4.30   <1e-4
Sex: Male    -2.41576   0.139383   -17.33   <1e-66
ββββββββββββββββββββββββββββββββββββββββββββββββββ
``````

Looks like docs need to be updated:

``````GeneralizedLinearMixedModel
``````

Generalized linear mixed-effects model representation

Fields

• `LMM` : a `LinearMixedModel` - the local approximation to the GLMM.
• `Ξ²` : the pivoted and possibly truncated fixed-effects vector
• `Ξ²β` : similar to `Ξ²` . Used in the PIRLS algorithm if step-halving is needed.
• `ΞΈ` : covariance parameter vector
• `b` : similar to `u` , equivalent to `broadcast!(*, b, LMM.Ξ, u)`
• `u` : a vector of matrices of random effects
• `uβ` : similar to `u` . Used in the PIRLS algorithm if step-halving is needed.
• `resp` : a `GlmResp` object
• `Ξ·` : the linear predictor
• `wt` : vector of prior case weights, a value of `T[]` indicates equal weights.

Notice that the name of the `wts` column must be fully qualified and it should be converted with `float.`.

I was surprised that this worked because of the rows with zero frequency. The typical way of coding contingency-table data like this for fitting a GLM or GLMM with a `Binomial` distribution is to combine the `No/Yes` responses into a single row where the response is the proportion of `Yes` and the weights are the total of `Yes` and `No`. I think that a clever `DataFrames.group_by` call, perhaps with `select` or `transform` should achieve that.

1 Like

You are right, that should be changed, but it is more a matter of inconsistency than incorrect documentation. The field is named `wt` but the argument in the `fit` method is named `wts`.

My favorite Oscar Wilde quote is βConsistency is the last refuge of the unimaginativeβ.

I was looking for something similar to Rβs `weights`
`glmer(formula = Survived ~ 1 + Age + Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic))`

Isnβt that what I wrote?

``````julia> const GLMM = GeneralizedLinearMixedModel;

julia> form = @formula(Survived ~ 1 + Age + Sex + (1|Class));

julia> m = fit(GLMM, form, titanic, Bernoulli(), wts=float.(titanic.Freq), nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2227.9546

Variance components:
Column    Variance  Std.Dev.
Class (Intercept)  0.3841519 0.6197999

Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
ββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate  Std.Error  z value  P(>|z|)
ββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept)   1.12414   0.330723     3.40   0.0007
Age: Child    1.04558   0.243237     4.30   <1e-4
Sex: Male    -2.41577   0.139384   -17.33   <1e-66
ββββββββββββββββββββββββββββββββββββββββββββββββββ

``````

I use nAGQ=3 instead of 2 because you always evaluate the midpoint so you might as well use the next odd number.

1 Like

Iβm not sure. The results seem quite different.

R:
`summary(glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic)))`

Julia
`MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + Age & Sex + (1 | Class)), titanic[titanic.Freq .> 0,:], wts = titanic[titanic.Freq .> 0,"Freq"], Bernoulli(), nAGQ = 3)`

``````julia> form = @formula(Survived ~ 1 + Age*Sex + (1|Class));

julia> nonzero = filter(:Freq => !iszero, titanic);

julia> m = fit(GLMM, form, nonzero, Bernoulli(), wts=nonzero.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2210.0492

Variance components:
Column    Variance   Std.Dev.
Class (Intercept)  0.40040641 0.63277675

Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate  Std.Error  z value  P(>|z|)
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept)              1.26181    0.339321     3.72   0.0002
Age: Child              -0.123581   0.335105    -0.37   0.7123
Sex: Male               -2.61007    0.14972    -17.43   <1e-67
Age: Child & Sex: Male   1.89475    0.432308     4.38   <1e-4
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
``````

To get the same answers as the call to `glmer` you would need to change the order of the levels on `Age` and `Sex`.

1 Like

#facepalm

The /beta coefficients line up now.
Followup: Any idea why the std.error, z-value, and p-values donβt match?

``````as.data.frame(Titanic) %>%
mutate(Age = relevel(x = Age, ref = "Adult"),
Sex = relevel(x = Sex, ref = "Female")) %>%
glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 3, data = .) %>%
summary()
``````

The standard errors, z-values and p-values match more-or-less with the results I posted. Keep in mind that these are even more approximate for a GLMM than for a LMM so βmatching resultsβ should be done with a rather wide tolerance.

@dmbates
I must be really dense today or something is off in my install (Win10-64).

Using Julia 1.4.2

I updated `MixedModels`

and ran your initial example code:

which for me yields

Iβve gotten the same results with both the Julia CL and Atom IDE.

Do you have any suggestions what the problem might be?

I was using MixedModels#master as in

``````(@v1.4) pkg> add MixedModels#master
``````

It seems that the coefficients are properly estimated but the standard deviation of the random effects and the standard errors are incorrect. @palday Do you recall if there was an issue related to weights in the released version v2.3.0? You have been doing much more on GLMMs while I have been down the automatic differentiation rabbit hole.

I was unable to reproduce the incorrect fit with MixedModels 2.3.0 β neither with `nAGQ=1` nor `nAGQ=3`.

I tried to rule out the user hardware error by using a JuliaBox but ran into issues with adding the package

The standard error issue that Iβm aware of never made it into a release β there have been some intermittent issues with weights but I donβt think theyβre related.

One thing I saw when skimming the thread is that there was one fit at the top with `nAGQ > 1` and `fast=true`. This corresponds to a configuration that simply isnβt possible in lme4, namely AGQ but while optimizing on only the random (and not the fixed) effects. In lme4, you get the Laplace approximation (`nAGQ=1`) when you optimize on only the RE (confusingly with the option `nAGQ=0`).

A clicheβd suggestion: can you run locally, but after restarting julia? So restart julia, do the updates, then do the fitting. Donβt update while a package is loaded.

Iβve been working on a RCall compatibility for MixedModels <=> lme4 in my package JellyMe4 (itβs on the Julia registry). The latest release wonβt move `glmer` to `GeneralizedMixedModel` but should be able to move Bernoulli models in the other direction (and will preserve contrasts). The funny thing there is that Iβm able to reproduce @laut 's fit that way:

``````julia> using RCall, JellyMe4
[ Info: Precompiling RCall [6f49c342-dc21-5d91-9882-a32aef131414]
[ Info: Precompiling JellyMe4 [19ac8677-9a15-4623-9afd-84acc6165ce7]

julia> model = (m, nonzero);
julia> @rput model;
[ Info: contrasts on interaction terms are assumed to decompose into products of contrasts on the first-order terms
[ Info: (if you don't know what that means, you're probably fine)
β Warning: Some accuracy is lost in translation for GLMMs. This is fine with plotting, but proceed with caution for inferences.
β You can try letting the model "reconverge" in lme4 with
β     update(model, control=glmerControl(optimizer="nloptwrap", calc.derivs=FALSE)).
β @ JellyMe4 ~/.julia/packages/JellyMe4/Hupt7/src/glmerMod.jl:144
[ Info: lme4 handles deviance for GLMMs differently than MixedModels.jl
[ Info: for the correct comparison, examine -2*logLik(RModel) and deviance(JuliaModel)

julia> R"summary(model)"
RObject{VecSxp}
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial  ( logit )
Formula: Survived ~ 1 + Age + Sex + Age:Sex + (1 | Class)
Data: jellyme4_data
Control: glmerControl(optimizer = "nloptwrap", optCtrl = list(maxeval = 1),
calc.derivs = FALSE)

AIC      BIC   logLik deviance df.resid
49.0     54.9    -19.5     39.0       19

Scaled residuals:
Min      1Q  Median      3Q     Max
-2.0631 -0.7030  0.5001  0.7373  2.1385

Random effects:
Groups Name        Variance Std.Dev.
Class  (Intercept) 0.4006   0.6329
Number of obs: 24, groups:  Class, 4

Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept)        1.2618     0.9218   1.369   0.1710
AgeChild          -0.1235     1.4687  -0.084   0.9330
SexMale           -2.6101     1.2214  -2.137   0.0326 *
AgeChild:SexMale   1.8948     1.9781   0.958   0.3381
---
Signif. codes:  0 β***β 0.001 β**β 0.01 β*β 0.05 β.β 0.1 β β 1

Correlation of Fixed Effects:
(Intr) AgChld SexMal
AgeChild    -0.552
SexMale     -0.666  0.416
AgChld:SxMl  0.411 -0.728 -0.617

julia> m
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2210.0511

Variance components:
Column   Variance Std.Dev.
Class (Intercept)  37.72355 6.14195

Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate  Std.Error     z value  P(>|z|)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept)              1.26179     3.29347   0.383119    0.7016
Age: Child              -0.123528    3.25191  -0.0379863   0.9697
Sex: Male               -2.61011     1.45291  -1.79647     0.0724
Age: Child & Sex: Male   1.89476     4.19518   0.451653    0.6515
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
``````

One thing to note is that weights donβt work in the current JellyMe4 release, so those are getting lost in translation. This suggests that something is happening with the weights in your bad runβ¦