Iβm reasonably sure thatβs the correct answer (because thatβs the only one that make sense in terms of other things we know are going on).
That said, there are only 4 levels of class, so weβre already really pushing the limits of what we should be fitting. After all, we know that variance estimates (outside of a mixed models context) based on 4 observations generally wonβt be great.
Iβm not sure whatβs going on there, but MixedModels 2.3.0 does have Julia 1.3 listed in its compatibility requirements, so itβs not surprising that it doesnβt run on earlier versions (in part because 1.3 has some threading and linear algebra enhancements that we take advantage of).
My guess is that the optimum isnβt well defined and that weβre seeing two local optima. I donβt have time at the moment to check the profiles (the code for this isnβt finalized in MixedModels.jl and I β and I suspect @dmbates too β have spent too much time in Julia lately to remember how to produce a profile plot quickly there). Because looking at the random effects β which is where the magic happens β weβre seeing approximately the same two answers again and again. Given that there are only 4 groups, this isnβt too surprising, but I would have to double check.
If I use the de-tabulated data in Julia: R write.csv(x = tidyr::uncount(data = data.frame(Titanic), weights = data.frame(Titanic)$Freq)[,-5], file = "C://Users/laut/Desktop/titanic.tsv") Julia titanic2 = CSV.File("C:\\Users\\laut\\Desktop\\titanic.tsv"; delim = "\t", header = true) |> DataFrame; MixedModels.fit(GeneralizedLinearMixedModel, @formula(Survived ~ 1 + Age * Sex + (1 | Class)), titanic2, Bernoulli(), nAGQ = 1)
The problem seems to be a difference between the released version of the MixedModels package (v2.3.0) and the master branch on the repository (v3.0.0-DEV). In v2.3.0 the estimated standard deviation of the random effects is over 6 in the model with the interaction term
$ julia-1.4.2 --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.4.2 (2020-05-23)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
(tmp) pkg> st
Status `/tmp/Project.toml`
[a93c6f00] DataFrames v0.21.3
[ff71e718] MixedModels v2.3.0
[6f49c342] RCall v0.13.7
julia> using DataFrames, MixedModels, RCall
julia> titanic = filter(:Freq => !iszero, rcopy(R"as.data.frame(datasets::Titanic)"));
julia> const GLMM = GeneralizedLinearMixedModel;
julia> form = @formula Survived ~ 1 + Age*Sex + (1|Class);
julia> m = fit(GLMM, form, titanic, Bernoulli(), wts=titanic.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}
Link: LogitLink()
Deviance: 2210.0492
Variance components:
Column Variance Std.Dev.
Class (Intercept) 37.752020 6.144267
Number of obs: 24; levels of grouping factors: 4
Fixed-effects parameters:
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate Std.Error z value P(>|z|)
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept) 0.423055 4.02214 0.105182 0.9162
Age: Adult -1.77117 2.73242 -0.648208 0.5169
Sex: Female 0.715531 3.93831 0.181685 0.8558
Age: Adult & Sex: Female 1.89466 4.19524 0.451621 0.6515
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
whereas in v3.0.0-DEV (which requires Julia-v1.4.0 or later)
$ julia-1.4.2 --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.4.2 (2020-05-23)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
(tmp) pkg> add MixedModels#master
Updating git-repo `https://github.com/JuliaStats/MixedModels.jl.git`
Updating registry at `~/.julia/registries/General`
Updating git-repo `https://github.com/JuliaRegistries/General.git`
Resolving package versions...
Updating `/tmp/Project.toml`
[ff71e718] β MixedModels v2.3.0 β v3.0.0-DEV #master (https://github.com/JuliaStats/MixedModels.jl.git)
Updating `/tmp/Manifest.toml`
[69666777] + Arrow v0.2.4
[becb17da] + Feather v0.5.6
[53afe959] + FlatBuffers v0.5.4
[ff71e718] β MixedModels v2.3.0 β v3.0.0-DEV #master (https://github.com/JuliaStats/MixedModels.jl.git)
(tmp) pkg> st
Status `/tmp/Project.toml`
[a93c6f00] DataFrames v0.21.3
[ff71e718] MixedModels v3.0.0-DEV #master (https://github.com/JuliaStats/MixedModels.jl.git)
[6f49c342] RCall v0.13.7
julia> using DataFrames, MixedModels, RCall
[ Info: Precompiling MixedModels [ff71e718-51f3-5ec2-a782-8ffcbfa3c316]
julia> titanic = filter(:Freq => !iszero, rcopy(R"as.data.frame(datasets::Titanic)"));
julia> const GLMM = GeneralizedLinearMixedModel;
julia> form = @formula Survived ~ 1 + Age*Sex + (1|Class);
julia> m = fit(GLMM, form, titanic, Bernoulli(), wts=titanic.Freq, nAGQ=3)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 3)
Survived ~ 1 + Age + Sex + Age & Sex + (1 | Class)
Distribution: Bernoulli{Float64}
Link: LogitLink()
Deviance: 2210.0492
Variance components:
Column Variance Std.Dev.
Class (Intercept) 0.40044704 0.63280885
Number of obs: 24; levels of grouping factors: 4
Fixed-effects parameters:
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Estimate Std.Error z value P(>|z|)
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
(Intercept) 0.422887 0.414341 1.02 0.3074
Age: Adult -1.77117 0.281569 -6.29 <1e-9
Sex: Female 0.715344 0.405834 1.76 0.0780
Age: Adult & Sex: Female 1.89475 0.432309 4.38 <1e-4
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Sorry that this took so long to straighten out. I guess we better get v3.0.0 released soon.
The reason that I used RCall instead of RDatasets to obtain the data set is to ensure that it is a more faithful copy of the R dataset. The factor level orders are the same as those in R. If you create a CSV file then read the CSV, which I believe is what happens in RDatasets, you canβt guarantee you will get the same ordering.
One of the big changes between 2.3 and 3.0-DEV is that fast=false fits use a two-stage optimization processes in 2.3, but a single stage in 3.0-DEV. There are a few examples where this leads to very different fitsβ¦
You can install the dev version with ]add MixedModels#master
The easiest way is to go into the package manager REPL and
add MixedModels#master
as shown in the transcript.
If you do it in a call to Pkg.add you have to remember how to create a PackageSpec to get a particular branch and I always need to look that up.
I can confirm thatβs the case. I removed MixedModel v2.3.0, added MixedModels#master, recompiled and ran the same code snippet and obtained the same result you showed.