I am working on a project that requires fitting multilevel models, and the dataset is quite large (7 million rows). My current implementation is in R, but it takes quite a bit of time with the glmmTMB package. So I was wondering whether it can be replicated with MixedModels.jl, in the hope that it will be faster.
My main issue is how to add an offset term and a spline. As for the offset, I saw some old posts stating that it hadn't been implemented, but I'm wondering whether it has been by now.
Here is my current R implementation:
```r
library(glmmTMB)

glmm_formula_hosp <- hospitalisation_cum_lead7 ~
  publicholiday +
  dow +
  splines::ns(date) +
  hw_mean_ehf_severity +
  age_band3 +
  sex +
  (1 | sa2_2021_code) +
  stats::offset(log(population_interp))

m_glmm <- glmmTMB::glmmTMB(
  formula = glmm_formula_hosp,
  data = df,
  family = glmmTMB::nbinom1(link = "log")
)
```
A post from four years ago discusses offsets. Offsets are available. The decision you will have to make is whether to use Poisson() as an approximation to the negative binomial, which is an option if overdispersion is mild. Splines2 provides an equivalent to R's ns. If negative binomial is a must-have, there's Turing, but performance will be slow.
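As a rough illustration of the Poisson-with-offset route, here is a minimal sketch in MixedModels.jl. It runs on synthetic data: the column names are taken from the R formula above, but the data-generating values are invented for the example, the spline term is left out, and the Poisson family stands in for nbinom1. The `offset` keyword takes the offset on the link scale, so the log is applied by hand.

```julia
using DataFrames, MixedModels, Random
using Distributions: Poisson

rng = MersenneTwister(42)
n = 5_000

# Synthetic stand-in for the real data (all values are made up).
df = DataFrame(
    publicholiday = rand(rng, 0:1, n),
    dow = rand(rng, ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"], n),
    hw_mean_ehf_severity = randn(rng, n),
    age_band3 = rand(rng, ["0-14", "15-64", "65+"], n),
    sex = rand(rng, ["F", "M"], n),
    sa2_2021_code = string.("sa2_", rand(rng, 1:50, n)),
    population_interp = 1_000 .+ 9_000 .* rand(rng, n),
)

# Counts whose expected value is proportional to population (the exposure).
mu = df.population_interp .* exp.(-7 .+ 0.3 .* df.hw_mean_ehf_severity)
df.hospitalisation_cum_lead7 = [rand(rng, Poisson(m)) for m in mu]

form = @formula(
    hospitalisation_cum_lead7 ~ 1 + publicholiday + dow + hw_mean_ehf_severity +
        age_band3 + sex + (1 | sa2_2021_code)
)

# Poisson() uses the log link by default; the offset is log(exposure).
m = fit(MixedModel, form, df, Poisson(); offset = log.(df.population_interp))
println(m)
```

Whether Poisson is acceptable for the real data depends on how much overdispersion there is, so it would be worth checking that before relying on this fit.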
As @technocrat mentioned, offsets are supported for GLMMs in MixedModels.jl. We don't have support for negative binomial models, and adding it is not currently something I have enough personal interest in to implement. (If somebody else were to contribute code adding support for negative binomial models, that would be most welcome.) I am more interested in spline support, but we don't have it built into the package. Splines2.jl doesn't support the @formula macro out of the box, but the package maintainer does provide some information on how to add it. If you did that, then splines should just work.
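A possible workaround that sidesteps the @formula integration entirely is to compute the spline basis up front and add its columns to the data frame as ordinary numeric predictors. The sketch below assumes that Splines2.jl's `ns` accepts a `Float64` vector and a `df` keyword, as its README suggests; the column names `date_ns1`, `date_ns2`, and so on are made up for the example, and `df = 4` is an arbitrary choice.

```julia
using DataFrames, Dates, Splines2

# Toy data: one row per day for a year.
df = DataFrame(date = collect(Date(2020, 1, 1):Day(1):Date(2020, 12, 31)))

# Splines need a numeric input, so convert dates to days since the first date.
t = Float64.(Dates.value.(df.date .- minimum(df.date)))

# Natural cubic spline basis with 4 columns (assumed `df` keyword).
B = Splines2.ns(t; df = 4)

# Store each basis column under a hypothetical name: date_ns1, date_ns2, ...
for j in axes(B, 2)
    df[!, Symbol("date_ns", j)] = B[:, j]
end

println(first(df, 3))
```

The new columns can then be listed in the formula like any other continuous predictor, for example `date_ns1 + date_ns2 + date_ns3 + date_ns4`. The drawback is that predicting on new data requires rebuilding the basis with the same knots, which proper formula support would handle automatically.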
It’s been a while since I looked at Splines2.jl, but a few things have happened in Julia since then, which would make adding official spline support much more straightforward. Realistically, I won’t have time to try this for at least a few days (which can easily turn into much longer since this is a hobby project of mine and I have a day job).