 # How to add weights parameter to Generalized Mixed Model

How do you add the weight of an observation to a Mixed Model? I thought I could add the Freq column to wt argument, but apparently not.

``````using RDatasets MixedModels
titanic = RDatasets.dataset("datasets", "Titanic")
titanic.surv_flg = titanic.Survived .== "Yes";
``````

This runs:

``````MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + (1 | Class)), titanic, Bernoulli(), nAGQ = 2, fast = true)
``````

But this doesn’t

``````MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age * Sex + (1 | Class)), titanic, wt = Freq, Bernoulli(), nAGQ = 2, fast = true)
``````

The short answer is that the argument name is `wts`, not `wt`. Using `MixedModels#master` under Julia-1.4.2 I get

``````julia> using RDatasets, MixedModels
[ Info: Precompiling MixedModels [ff71e718-51f3-5ec2-a782-8ffcbfa3c316]

julia> titanic = RDatasets.dataset("datasets", "Titanic");

julia> m = fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age + Sex + (1|Class)), titanic, Binomial(), wts=float.(titanic.Freq))
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
Survived ~ 1 + Age + Sex + (1 | Class)
Distribution: Binomial{Float64}

Deviance: 2227.9565

Variance components:
Column    Variance  Std.Dev.
Class (Intercept)  0.38409626 0.619755

Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────
(Intercept)   1.1241    0.330702     3.40   0.0007
Age: Child    1.04557   0.243236     4.30   <1e-4
Sex: Male    -2.41576   0.139383   -17.33   <1e-66
──────────────────────────────────────────────────
``````

Looks like docs need to be updated:

``````GeneralizedLinearMixedModel
``````

Generalized linear mixed-effects model representation

Fields

• `LMM` : a `LinearMixedModel` - the local approximation to the GLMM.
• `β` : the pivoted and possibly truncated fixed-effects vector
• `β₀` : similar to `β` . Used in the PIRLS algorithm if step-halving is needed.
• `θ` : covariance parameter vector
• `b` : similar to `u` , equivalent to `broadcast!(*, b, LMM.Λ, u)`
• `u` : a vector of matrices of random effects
• `u₀` : similar to `u` . Used in the PIRLS algorithm if step-halving is needed.
• `resp` : a `GlmResp` object
• `η` : the linear predictor
• `wt` : vector of prior case weights, a value of `T[]` indicates equal weights.

Notice that the name of the `wts` column must be fully qualified and it should be converted with `float.`.

I was surprised that this worked because of the rows with zero frequency. The typical way of coding contingency-table data like this for fitting a GLM or GLMM with a `Binomial` distribution is to combine the `No/Yes` responses into a single row where the response is the proportion of `Yes` and the weights are the total of `Yes` and `No`. I think that a clever `DataFrames.group_by` call, perhaps with `select` or `transform` should achieve that.

1 Like

You are right, that should be changed, but it is more a matter of inconsistency than incorrect documentation. The field is named `wt` but the argument in the `fit` method is named `wts`.

My favorite Oscar Wilde quote is “Consistency is the last refuge of the unimaginative”.

I was looking for something similar to R’s `weights`
`glmer(formula = Survived ~ 1 + Age + Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic))`

Isn’t that what I wrote?

``````julia> const GLMM = GeneralizedLinearMixedModel;

julia> form = @formula(Survived ~ 1 + Age + Sex + (1|Class));

julia> m = fit(GLMM, form, titanic, Bernoulli(), wts=float.(titanic.Freq), nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2227.9546

Variance components:
Column    Variance  Std.Dev.
Class (Intercept)  0.3841519 0.6197999

Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────
(Intercept)   1.12414   0.330723     3.40   0.0007
Age: Child    1.04558   0.243237     4.30   <1e-4
Sex: Male    -2.41577   0.139384   -17.33   <1e-66
──────────────────────────────────────────────────

``````

I use nAGQ=3 instead of 2 because you always evaluate the midpoint so you might as well use the next odd number.

1 Like

I’m not sure. The results seem quite different.

R:
`summary(glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic)))`

Julia
`MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + Age & Sex + (1 | Class)), titanic[titanic.Freq .> 0,:], wts = titanic[titanic.Freq .> 0,"Freq"], Bernoulli(), nAGQ = 3)` ``````julia> form = @formula(Survived ~ 1 + Age*Sex + (1|Class));

julia> nonzero = filter(:Freq => !iszero, titanic);

julia> m = fit(GLMM, form, nonzero, Bernoulli(), wts=nonzero.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2210.0492

Variance components:
Column    Variance   Std.Dev.
Class (Intercept)  0.40040641 0.63277675

Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────────────────
Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────────────────
(Intercept)              1.26181    0.339321     3.72   0.0002
Age: Child              -0.123581   0.335105    -0.37   0.7123
Sex: Male               -2.61007    0.14972    -17.43   <1e-67
Age: Child & Sex: Male   1.89475    0.432308     4.38   <1e-4
──────────────────────────────────────────────────────────────
``````

To get the same answers as the call to `glmer` you would need to change the order of the levels on `Age` and `Sex`.

1 Like

#facepalm

The /beta coefficients line up now.
Followup: Any idea why the std.error, z-value, and p-values don’t match?

``````as.data.frame(Titanic) %>%
mutate(Age = relevel(x = Age, ref = "Adult"),
Sex = relevel(x = Sex, ref = "Female")) %>%
glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 3, data = .) %>%
summary()
``````

The standard errors, z-values and p-values match more-or-less with the results I posted. Keep in mind that these are even more approximate for a GLMM than for a LMM so “matching results” should be done with a rather wide tolerance.

@dmbates
I must be really dense today or something is off in my install (Win10-64).

Using Julia 1.4.2 I updated `MixedModels` and ran your initial example code: which for me yields  I’ve gotten the same results with both the Julia CL and Atom IDE.

Do you have any suggestions what the problem might be?

I was using MixedModels#master as in

``````(@v1.4) pkg> add MixedModels#master
``````

It seems that the coefficients are properly estimated but the standard deviation of the random effects and the standard errors are incorrect. @palday Do you recall if there was an issue related to weights in the released version v2.3.0? You have been doing much more on GLMMs while I have been down the automatic differentiation rabbit hole.

I was unable to reproduce the incorrect fit with MixedModels 2.3.0 – neither with `nAGQ=1` nor `nAGQ=3`.

I tried to rule out the user hardware error by using a JuliaBox but ran into issues with adding the package The standard error issue that I’m aware of never made it into a release – there have been some intermittent issues with weights but I don’t think they’re related.

One thing I saw when skimming the thread is that there was one fit at the top with `nAGQ > 1` and `fast=true`. This corresponds to a configuration that simply isn’t possible in lme4, namely AGQ but while optimizing on only the random (and not the fixed) effects. In lme4, you get the Laplace approximation (`nAGQ=1`) when you optimize on only the RE (confusingly with the option `nAGQ=0`).

A cliche’d suggestion: can you run locally, but after restarting julia? So restart julia, do the updates, then do the fitting. Don’t update while a package is loaded.

I’ve been working on a RCall compatibility for MixedModels <=> lme4 in my package JellyMe4 (it’s on the Julia registry). The latest release won’t move `glmer` to `GeneralizedMixedModel` but should be able to move Bernoulli models in the other direction (and will preserve contrasts). The funny thing there is that I’m able to reproduce @laut 's fit that way:

``````julia> using RCall, JellyMe4
[ Info: Precompiling RCall [6f49c342-dc21-5d91-9882-a32aef131414]
[ Info: Precompiling JellyMe4 [19ac8677-9a15-4623-9afd-84acc6165ce7]

julia> model = (m, nonzero);
julia> @rput model;
[ Info: contrasts on interaction terms are assumed to decompose into products of contrasts on the first-order terms
[ Info: (if you don't know what that means, you're probably fine)
┌ Warning: Some accuracy is lost in translation for GLMMs. This is fine with plotting, but proceed with caution for inferences.
│ You can try letting the model "reconverge" in lme4 with
│     update(model, control=glmerControl(optimizer="nloptwrap", calc.derivs=FALSE)).
└ @ JellyMe4 ~/.julia/packages/JellyMe4/Hupt7/src/glmerMod.jl:144
[ Info: lme4 handles deviance for GLMMs differently than MixedModels.jl
[ Info: for the correct comparison, examine -2*logLik(RModel) and deviance(JuliaModel)

julia> R"summary(model)"
RObject{VecSxp}
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: binomial  ( logit )
Formula: Survived ~ 1 + Age + Sex + Age:Sex + (1 | Class)
Data: jellyme4_data
Control: glmerControl(optimizer = "nloptwrap", optCtrl = list(maxeval = 1),
calc.derivs = FALSE)

AIC      BIC   logLik deviance df.resid
49.0     54.9    -19.5     39.0       19

Scaled residuals:
Min      1Q  Median      3Q     Max
-2.0631 -0.7030  0.5001  0.7373  2.1385

Random effects:
Groups Name        Variance Std.Dev.
Class  (Intercept) 0.4006   0.6329
Number of obs: 24, groups:  Class, 4

Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept)        1.2618     0.9218   1.369   0.1710
AgeChild          -0.1235     1.4687  -0.084   0.9330
SexMale           -2.6101     1.2214  -2.137   0.0326 *
AgeChild:SexMale   1.8948     1.9781   0.958   0.3381
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
(Intr) AgChld SexMal
AgeChild    -0.552
SexMale     -0.666  0.416
AgChld:SxMl  0.411 -0.728 -0.617

julia> m
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}

Deviance: 2210.0511

Variance components:
Column   Variance Std.Dev.
Class (Intercept)  37.72355 6.14195

Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
─────────────────────────────────────────────────────────────────
Estimate  Std.Error     z value  P(>|z|)
─────────────────────────────────────────────────────────────────
(Intercept)              1.26179     3.29347   0.383119    0.7016
Age: Child              -0.123528    3.25191  -0.0379863   0.9697
Sex: Male               -2.61011     1.45291  -1.79647     0.0724
Age: Child & Sex: Male   1.89476     4.19518   0.451653    0.6515
─────────────────────────────────────────────────────────────────
``````

One thing to note is that weights don’t work in the current JellyMe4 release, so those are getting lost in translation. This suggests that something is happening with the weights in your bad run…