lme4::lmList equivalent in Julia

lindemann09 · August 10, 2023, 10:14am

The lme4 package in R has the function lmList, which I use heavily to examine shrinkage in mixed-models. MixedModels.jl does not seem to have the function. MixedModelsMakie.jl is great to visualize shrinkage, but it does not give the me the subgroup data analyses.

Is there a equivalent method in Julia to fit lm or glm to different subgroups of the data?

palday · August 12, 2023, 7:33pm

There’s no comparable function in Julia. Using the same technique that MixedModelsMakie uses, you could do:

using MixedModels
using DataFrames

"""
    unshrunken_raneftables(m::LinearMixedModel, θref=10^4 .* m.θ; uscale::Bool=false)

Compute the conditional modes at θref.

!!! warn
    This function is **not** thread safe because it temporarily mutates
    the passed model before restoring its original form.
"""
function unshrunken_raneftables(m::LinearMixedModel, θref=10^4 .* m.θ; uscale::Bool=false)
    vv = try
        raneftables(updateL!(setθ!(m, θref)); uscale)
    catch e
        @error "Failed to compute unshrunken values with the following exception:"
        rethrow(e)
    finally
        updateL!(setθ!(m, m.optsum.final)) # restore parameter estimates and update m
    end

    return vv
end

model = fit(MixedModel, @formula(reaction ~ 1 + days + (1+days|subj)), MixedModels.dataset(:sleepstudy))

# returns a named tuple of tables, where each entry in the named tuple is a grouping factor
unshrunken = unshrunken_raneftables(model)

# look at the entries for grouping factor
df = DataFrame(unshrunken[:subj])

# add in the fixed effects to get approximate OLS fits --
# not quite OLS because there is still some minimal shrinkage going on,
# but that's necessary to make this work for more complicated models with 
# multiple grouping factors or fixed effects without associated random effects

pseudo_ols = transform(df, "(Intercept)" => ByRow(x -> x + fixef(model)[1]),
                       "days" => ByRow(x -> x + fixef(model)[2]); renamecols=false)

We can also use DataFrame’s convenient groupby and do real OLS for this simple example:

# now we do the same thing using DataFrames and GLM
sleepstudy = DataFrame(MixedModels.dataset(:sleepstudy))

ols = DataFrame()
for gdf in groupby(sleepstudy, :subj)
    group = only(unique(gdf.subj))
    m = fit(LinearModel, @formula(reaction ~ 1 + days), gdf)
    row = NamedTuple(Symbol.(coefnames(m)) .=> coef(m))
    row = merge(row, (; subj=group))
    push!(ols, row)
end

To double check you get similar answers:

sort!(ols, :subj)
sort!(pseudo_ols, :subj)
all(isapprox.(ols[!, "(Intercept)"], pseudo_ols[!, "(Intercept)"]; atol=0.001))
all(isapprox.(ols[!, "days"], pseudo_ols[!, "days"]; atol=0.001))

lindemann09 · August 14, 2023, 10:23am

Thanks a lot, unshrunken_raneftables is what I was look for. That’s much more efficient than ols on grouped data.

for shake of completeness: LinearModel requires using GLM