How to add weights parameter to Generalized Mixed Model

laut · June 24, 2020, 2:35pm

How do you add the weight of an observation to a Mixed Model? I thought I could add the Freq column to wt argument, but apparently not.

using RDatasets MixedModels
titanic = RDatasets.dataset("datasets", "Titanic")
titanic.surv_flg = titanic.Survived .== "Yes";

This runs:

MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + (1 | Class)), titanic, Bernoulli(), nAGQ = 2, fast = true)

But this doesn’t

MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age * Sex + (1 | Class)), titanic, wt = Freq, Bernoulli(), nAGQ = 2, fast = true)

dmbates · June 24, 2020, 3:48pm

The short answer is that the argument name is wts, not wt. Using MixedModels#master under Julia-1.4.2 I get

julia> using RDatasets, MixedModels
[ Info: Precompiling RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]
[ Info: Precompiling MixedModels [ff71e718-51f3-5ec2-a782-8ffcbfa3c316]

julia> titanic = RDatasets.dataset("datasets", "Titanic");

julia> m = fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age + Sex + (1|Class)), titanic, Binomial(), wts=float.(titanic.Freq))
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
  Survived ~ 1 + Age + Sex + (1 | Class)
  Distribution: Binomial{Float64}
  Link: LogitLink()

  Deviance: 2227.9565

Variance components:
         Column    Variance  Std.Dev. 
Class (Intercept)  0.38409626 0.619755

 Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────
             Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────
(Intercept)   1.1241    0.330702     3.40   0.0007
Age: Child    1.04557   0.243236     4.30   <1e-4
Sex: Male    -2.41576   0.139383   -17.33   <1e-66
──────────────────────────────────────────────────

mthelm85 · June 24, 2020, 3:51pm

Looks like docs need to be updated:

MixedModels.GeneralizedLinearMixedModel — Type.
GeneralizedLinearMixedModel
Generalized linear mixed-effects model representation

Fields

LMM : a LinearMixedModel - the local approximation to the GLMM.

β : the pivoted and possibly truncated fixed-effects vector

β₀ : similar to β . Used in the PIRLS algorithm if step-halving is needed.

θ : covariance parameter vector

b : similar to u , equivalent to broadcast!(*, b, LMM.Λ, u)

u : a vector of matrices of random effects

u₀ : similar to u . Used in the PIRLS algorithm if step-halving is needed.

resp : a GlmResp object

η : the linear predictor

wt : vector of prior case weights, a value of T[] indicates equal weights.

dmbates · June 24, 2020, 3:56pm

Notice that the name of the wts column must be fully qualified and it should be converted with float..

I was surprised that this worked because of the rows with zero frequency. The typical way of coding contingency-table data like this for fitting a GLM or GLMM with a Binomial distribution is to combine the No/Yes responses into a single row where the response is the proportion of Yes and the weights are the total of Yes and No. I think that a clever DataFrames.group_by call, perhaps with select or transform should achieve that.

dmbates · June 24, 2020, 3:59pm

You are right, that should be changed, but it is more a matter of inconsistency than incorrect documentation. The field is named wt but the argument in the fit method is named wts.

My favorite Oscar Wilde quote is “Consistency is the last refuge of the unimaginative”.

laut · June 24, 2020, 3:59pm

I was looking for something similar to R’s weights
glmer(formula = Survived ~ 1 + Age + Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic))

dmbates · June 24, 2020, 4:16pm

Isn’t that what I wrote?

julia> const GLMM = GeneralizedLinearMixedModel;

julia> form = @formula(Survived ~ 1 + Age + Sex + (1|Class));

julia> m = fit(GLMM, form, titanic, Bernoulli(), wts=float.(titanic.Freq), nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
  Survived ~ 1 + Age + Sex + (1 | Class)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

  Deviance: 2227.9546

Variance components:
         Column    Variance  Std.Dev. 
Class (Intercept)  0.3841519 0.6197999

 Number of obs: 32; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────
             Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────
(Intercept)   1.12414   0.330723     3.40   0.0007
Age: Child    1.04558   0.243237     4.30   <1e-4
Sex: Male    -2.41577   0.139384   -17.33   <1e-66
──────────────────────────────────────────────────

dmbates · June 24, 2020, 4:18pm

I use nAGQ=3 instead of 2 because you always evaluate the midpoint so you might as well use the next odd number.

laut · June 24, 2020, 4:25pm

I’m not sure. The results seem quite different.

R:
summary(glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 2, data = as.data.frame(Titanic)))

Julia
MixedModels.fit(GeneralizedLinearMixedModel, @formula(surv_flg ~ 1 + Age + Sex + Age & Sex + (1 | Class)), titanic[titanic.Freq .> 0,:], wts = titanic[titanic.Freq .> 0,"Freq"], Bernoulli(), nAGQ = 3)

dmbates · June 24, 2020, 4:39pm

julia> form = @formula(Survived ~ 1 + Age*Sex + (1|Class));

julia> nonzero = filter(:Freq => !iszero, titanic);

julia> m = fit(GLMM, form, nonzero, Bernoulli(), wts=nonzero.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
  Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

  Deviance: 2210.0492

Variance components:
         Column    Variance   Std.Dev.  
Class (Intercept)  0.40040641 0.63277675

 Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
──────────────────────────────────────────────────────────────
                         Estimate  Std.Error  z value  P(>|z|)
──────────────────────────────────────────────────────────────
(Intercept)              1.26181    0.339321     3.72   0.0002
Age: Child              -0.123581   0.335105    -0.37   0.7123
Sex: Male               -2.61007    0.14972    -17.43   <1e-67
Age: Child & Sex: Male   1.89475    0.432308     4.38   <1e-4
──────────────────────────────────────────────────────────────

dmbates · June 24, 2020, 4:43pm

To get the same answers as the call to glmer you would need to change the order of the levels on Age and Sex.

laut · June 24, 2020, 5:00pm

#facepalm

Thanks for your help/patience.

The /beta coefficients line up now.
Followup: Any idea why the std.error, z-value, and p-values don’t match?

as.data.frame(Titanic) %>% 
    mutate(Age = relevel(x = Age, ref = "Adult"),
           Sex = relevel(x = Sex, ref = "Female")) %>%
    glmer(formula = Survived ~ 1 + Age * Sex + (1|Class), family = binomial, weights = Freq, nAGQ = 3, data = .) %>% 
    summary()

dmbates · June 24, 2020, 5:14pm

The standard errors, z-values and p-values match more-or-less with the results I posted. Keep in mind that these are even more approximate for a GLMM than for a LMM so “matching results” should be done with a rather wide tolerance.

laut · June 24, 2020, 6:40pm

@dmbates
I must be really dense today or something is off in my install (Win10-64).

Using Julia 1.4.2

I updated MixedModels

and ran your initial example code:

which for me yields

and not your results

I’ve gotten the same results with both the Julia CL and Atom IDE.

Do you have any suggestions what the problem might be?

dmbates · June 24, 2020, 7:26pm

I was using MixedModels#master as in

(@v1.4) pkg> add MixedModels#master

It seems that the coefficients are properly estimated but the standard deviation of the random effects and the standard errors are incorrect. @palday Do you recall if there was an issue related to weights in the released version v2.3.0? You have been doing much more on GLMMs while I have been down the automatic differentiation rabbit hole.

palday · June 24, 2020, 7:27pm

I was unable to reproduce the incorrect fit with MixedModels 2.3.0 – neither with nAGQ=1 nor nAGQ=3.

laut · June 24, 2020, 7:28pm

I tried to rule out the user hardware error by using a JuliaBox but ran into issues with adding the package

palday · June 24, 2020, 7:32pm

The standard error issue that I’m aware of never made it into a release – there have been some intermittent issues with weights but I don’t think they’re related.

One thing I saw when skimming the thread is that there was one fit at the top with nAGQ > 1 and fast=true. This corresponds to a configuration that simply isn’t possible in lme4, namely AGQ but while optimizing on only the random (and not the fixed) effects. In lme4, you get the Laplace approximation (nAGQ=1) when you optimize on only the RE (confusingly with the option nAGQ=0).

palday · June 24, 2020, 7:33pm

A cliche’d suggestion: can you run locally, but after restarting julia? So restart julia, do the updates, then do the fitting. Don’t update while a package is loaded.

palday · June 24, 2020, 7:42pm

I’ve been working on a RCall compatibility for MixedModels <=> lme4 in my package JellyMe4 (it’s on the Julia registry). The latest release won’t move glmer to GeneralizedMixedModel but should be able to move Bernoulli models in the other direction (and will preserve contrasts). The funny thing there is that I’m able to reproduce @laut 's fit that way:

julia> using RCall, JellyMe4
[ Info: Precompiling RCall [6f49c342-dc21-5d91-9882-a32aef131414]
[ Info: Precompiling JellyMe4 [19ac8677-9a15-4623-9afd-84acc6165ce7]

julia> model = (m, nonzero);
julia> @rput model;
[ Info: contrasts on interaction terms are assumed to decompose into products of contrasts on the first-order terms
[ Info: (if you don't know what that means, you're probably fine)
┌ Warning: Some accuracy is lost in translation for GLMMs. This is fine with plotting, but proceed with caution for inferences.
│ You can try letting the model "reconverge" in lme4 with
│     update(model, control=glmerControl(optimizer="nloptwrap", calc.derivs=FALSE)).
└ @ JellyMe4 ~/.julia/packages/JellyMe4/Hupt7/src/glmerMod.jl:144
[ Info: lme4 handles deviance for GLMMs differently than MixedModels.jl
[ Info: for the correct comparison, examine -2*logLik(RModel) and deviance(JuliaModel)

julia> R"summary(model)"
RObject{VecSxp}
Generalized linear mixed model fit by maximum likelihood (Laplace
  Approximation) [glmerMod]
 Family: binomial  ( logit )
Formula: Survived ~ 1 + Age + Sex + Age:Sex + (1 | Class)
   Data: jellyme4_data
Control: glmerControl(optimizer = "nloptwrap", optCtrl = list(maxeval = 1),  
    calc.derivs = FALSE)

     AIC      BIC   logLik deviance df.resid 
    49.0     54.9    -19.5     39.0       19 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.0631 -0.7030  0.5001  0.7373  2.1385 

Random effects:
 Groups Name        Variance Std.Dev.
 Class  (Intercept) 0.4006   0.6329  
Number of obs: 24, groups:  Class, 4

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)  
(Intercept)        1.2618     0.9218   1.369   0.1710  
AgeChild          -0.1235     1.4687  -0.084   0.9330  
SexMale           -2.6101     1.2214  -2.137   0.0326 *
AgeChild:SexMale   1.8948     1.9781   0.958   0.3381  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) AgChld SexMal
AgeChild    -0.552              
SexMale     -0.666  0.416       
AgChld:SxMl  0.411 -0.728 -0.617

julia> m
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
  Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
  Distribution: Bernoulli{Float64}
  Link: LogitLink()

  Deviance: 2210.0511

Variance components:
         Column   Variance Std.Dev.
Class (Intercept)  37.72355 6.14195

 Number of obs: 24; levels of grouping factors: 4

Fixed-effects parameters:
─────────────────────────────────────────────────────────────────
                         Estimate  Std.Error     z value  P(>|z|)
─────────────────────────────────────────────────────────────────
(Intercept)              1.26179     3.29347   0.383119    0.7016
Age: Child              -0.123528    3.25191  -0.0379863   0.9697
Sex: Male               -2.61011     1.45291  -1.79647     0.0724
Age: Child & Sex: Male   1.89476     4.19518   0.451653    0.6515
─────────────────────────────────────────────────────────────────

One thing to note is that weights don’t work in the current JellyMe4 release, so those are getting lost in translation. This suggests that something is happening with the weights in your bad run…

Topic		Replies	Views
How to specify weights (wts) in MixedModels.jl? Statistics mixed-models	0	241	May 11, 2023
Calling julia GeneralizedMixedModel from R Statistics	1	329	November 28, 2022
MixedModels.jl confidence intervals and repeated measure, how to get? Statistics question	9	1238	June 26, 2019
Laplace approximation for mixed models in Julia (TMB-like functionality) Statistics question	35	5200	August 10, 2021
MixedModels fit error Statistics question , package	6	1227	January 13, 2017

How to add weights parameter to Generalized Mixed Model

Related topics