lme4::lmList equivalent in Julia

The lme4 package in R has the function lmList, which I use heavily to examine shrinkage in mixed-models. MixedModels.jl does not seem to have the function. MixedModelsMakie.jl is great to visualize shrinkage, but it does not give the me the subgroup data analyses.

Is there a equivalent method in Julia to fit lm or glm to different subgroups of the data?

There’s no comparable function in Julia. Using the same technique that MixedModelsMakie uses, you could do:

using MixedModels
using DataFrames

"""
    unshrunken_raneftables(m::LinearMixedModel, θref=10^4 .* m.θ; uscale::Bool=false)

Compute the conditional modes at θref.

!!! warn
    This function is **not** thread safe because it temporarily mutates
    the passed model before restoring its original form.
"""
function unshrunken_raneftables(m::LinearMixedModel, θref=10^4 .* m.θ; uscale::Bool=false)
    vv = try
        raneftables(updateL!(setθ!(m, θref)); uscale)
    catch e
        @error "Failed to compute unshrunken values with the following exception:"
        rethrow(e)
    finally
        updateL!(setθ!(m, m.optsum.final)) # restore parameter estimates and update m
    end

    return vv
end

model = fit(MixedModel, @formula(reaction ~ 1 + days + (1+days|subj)), MixedModels.dataset(:sleepstudy))

# returns a named tuple of tables, where each entry in the named tuple is a grouping factor
unshrunken = unshrunken_raneftables(model)

# look at the entries for grouping factor
df = DataFrame(unshrunken[:subj])

# add in the fixed effects to get approximate OLS fits --
# not quite OLS because there is still some minimal shrinkage going on,
# but that's necessary to make this work for more complicated models with 
# multiple grouping factors or fixed effects without associated random effects

pseudo_ols = transform(df, "(Intercept)" => ByRow(x -> x + fixef(model)[1]),
                       "days" => ByRow(x -> x + fixef(model)[2]); renamecols=false)

We can also use DataFrame’s convenient groupby and do real OLS for this simple example:

# now we do the same thing using DataFrames and GLM
sleepstudy = DataFrame(MixedModels.dataset(:sleepstudy))

ols = DataFrame()
for gdf in groupby(sleepstudy, :subj)
    group = only(unique(gdf.subj))
    m = fit(LinearModel, @formula(reaction ~ 1 + days), gdf)
    row = NamedTuple(Symbol.(coefnames(m)) .=> coef(m))
    row = merge(row, (; subj=group))
    push!(ols, row)
end

To double check you get similar answers:

sort!(ols, :subj)
sort!(pseudo_ols, :subj)
all(isapprox.(ols[!, "(Intercept)"], pseudo_ols[!, "(Intercept)"]; atol=0.001))
all(isapprox.(ols[!, "days"], pseudo_ols[!, "days"]; atol=0.001))
2 Likes

Thanks a lot, unshrunken_raneftables is what I was look for. That’s much more efficient than ols on grouped data.

for shake of completeness: LinearModel requires using GLM

1 Like