[ANN] Metida.jl: experimental package for mixed-effects models with flexible covariance structure

Metida.jl is a Julia package for fitting mixed-effects models with flexible covariance structure. At this moment package is in early development stage.

Main goal to make reproducible output corresponding to SAS/SPSS.

Now implemented covariance structures:

  • Scaled Identity (SI)
  • Variance Components / Diagonal (VC)
  • Autoregressive (AR)
  • Heterogeneous Autoregressive (ARH)
  • Heterogeneous Compound Symmetry (CSH)

Usage:

LMM(model, data; subject = nothing, random = nothing, repeated = nothing)

where

model is a fixed-effect model (@formula), example: @formula(var ~ sequence + period + formulation)

random vector of random effects or single random effect. Effect can be declared like this: VarEffect(@covstr(formulation), CSH). @covstr is a effect model: @covstr(formulation). CSH is a CovarianceType structure. Premade constants: SI, VC, AR, ARH, CSH.

repeated is a repeated effect (only single).

subject is a block-diagonal factor.

Examples:

using Metida, StatsBase, StatsModels, CSV, DataFrames
df = CSV.File(dirname(pathof(Metida))*"\\..\\test\\csv\\df0.csv") |> DataFrame

# EXAMPLE 1
################################################################################
################################################################################
################################################################################
#=
PROC MIXED data=df0;
CLASSES subject sequence period formulation;
MODEL  var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM  formulation/TYPE=CSH SUB=subject G V;
REPEATED/GRP=formulation SUB=subject R;
RUN;

REML: 10.06523862
=#

lmm = LMM(@formula(var ~ sequence + period + formulation), df;
random   = VarEffect(@covstr(formulation), CSH),
repeated = VarEffect(@covstr(formulation), VC),
subject  = :subject)

fit!(lmm)
#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
   Model: formulation
   Type: HeterogeneousCompoundSymmetry (3)
   Coefnames: ["formulation: 1", "formulation: 2"]
Repeated:
   Model: formulation
   Type: VarianceComponents (2)
   Coefnames: ["formulation: 1", "formulation: 2"]

Status: converged

   -2 logREML: 10.0652

   Fixed effects:

Name             Estimate     SE
(Intercept)      1.57749      0.334543
sequence: 2      -0.170833    0.384381
period: 2        0.195984     0.117228
period: 3        0.145014     0.109171
period: 4        0.157363     0.117228
formulation: 2   -0.0791667   0.0903709

Random effects:

   θ vector: [0.455584, 0.367656, 1.0, 0.143682, 0.205657]

Random 1   formulation: 1   var   0.207557
Random 1   formulation: 2   var   0.135171
Random 1   Rho              rho   1.0
Repeated   formulation: 1   var   0.0206445
Repeated   formulation: 2   var   0.0422948
=#

# EXAMPLE 2
################################################################################
################################################################################
################################################################################
#=
PROC MIXED data=df0;
CLASSES subject sequence period formulation;
MODEL  var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM  formulation/TYPE=VC SUB=subject G V;
REPEATED/GRP=formulation SUB=subject R;
RUN;

REML: 16.06148160
=#

lmm = LMM(
    @formula(var ~ sequence + period + formulation), df;
    random   = VarEffect(@covstr(formulation), SI),
    repeated = VarEffect(@covstr(formulation), VC),
    subject  = :subject,
)
fit!(lmm)
#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
   Model: formulation
   Type: ScaledIdentity (1)
   Coefnames: ["formulation: 1", "formulation: 2"]
Repeated:
   Model: formulation
   Type: VarianceComponents (2)
   Coefnames: ["formulation: 1", "formulation: 2"]

Status: converged

   -2 logREML: 16.0615

   Fixed effects:

Name             Estimate     SE
(Intercept)      1.57212      0.305807
sequence: 2      -0.170833    0.279555
period: 2        0.204087     0.289957
period: 3        0.155769     0.11308
period: 4        0.160015     0.289957
formulation: 2   -0.0791667   0.279555

Random effects:

   θ vector: [0.412436, 0.145184, 0.220819]

Random 1   Var              var   0.170103
Repeated   formulation: 1   var   0.0210784
Repeated   formulation: 2   var   0.048761
=#

#EXAMPLE 3
################################################################################
################################################################################
################################################################################
#=
PROC MIXED data=df0;
CLASSES subject sequence period formulation;
MODEL  var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM  subject/TYPE=VC G V;
RUN;

REML: 10.86212458
=#

lmm = LMM(@formula(var ~ sequence + period + formulation), df;
    random = VarEffect(@covstr(subject), SI)
    )
fit!(lmm)
#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
   Model: subject
   Type: ScaledIdentity (1)
   Coefnames: ["subject: 1", "subject: 2", "subject: 3", "subject: 4", "subject: 5"]
Repeated:
   Model: nothing
   Type: ScaledIdentity (1)
   Coefnames: -

Status: converged

   -2 logREML: 10.8621

   Fixed effects:

Name             Estimate     SE
(Intercept)      1.61         0.309774
sequence: 2      -0.170833    0.383959
period: 2        0.144167     0.116706
period: 3        0.08         0.115509
period: 4        0.144167     0.116706
formulation: 2   -0.0791667   0.0833617

Random effects:

   θ vector: [0.410574, 0.182636]

Random 1   Var   var   0.168571
Repeated   Var   var   0.0333559
=#

#EXAMPLE 4
################################################################################
################################################################################
################################################################################
#=
PROC MIXED data=df0;
CLASSES subject sequence period formulation;
MODEL  var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM  period/TYPE=VC G V;
RANDOM  formulation/TYPE=VC G V;
RUN;

REML: 25.12948063
=#
lmm = LMM(
    @formula(var ~ sequence + period + formulation), df;
    random   = [VarEffect(@covstr(period), VC), VarEffect(@covstr(formulation), VC)]
)
fit!(lmm)

#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
   Model: period
   Type: VarianceComponents (4)
   Coefnames: ["period: 1", "period: 2", "period: 3", "period: 4"]
Random 2:
   Model: formulation
   Type: VarianceComponents (2)
   Coefnames: ["formulation: 1", "formulation: 2"]
Repeated:
   Model: nothing
   Type: ScaledIdentity (1)
   Coefnames: -

Status: converged

   -2 logREML: 25.1295

   Fixed effects:

Name             Estimate     SE
(Intercept)      1.61         0.249491
sequence: 2      -0.170833    0.192487
period: 2        0.144167     0.269481
period: 3        0.08         0.266717
period: 4        0.144167     0.269481
formulation: 2   -0.0791667   0.192487

Random effects:

   θ vector: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.421717]

Random 1   period: 1        var   0.0
Random 1   period: 2        var   0.0
Random 1   period: 3        var   0.0
Random 1   period: 4        var   0.0
Random 2   formulation: 1   var   0.0
Random 2   formulation: 2   var   0.0
Repeated   Var              var   0.177845
=#

#EXAMPLE 5
################################################################################
################################################################################
################################################################################
#=
PROC MIXED data=df0;
CLASSES subject sequence period formulation;
MODEL  var = sequence period formulation/ DDFM=SATTERTH s;
RANDOM  formulation/TYPE=UN(1) SUB=subject G V;
RUN;

REML: 16.24111264
=#

lmm = LMM(@formula(var ~ sequence + period + formulation), df;
    random  = VarEffect(@covstr(formulation), VC),
    subject = :subject)
fit!(lmm)

#=
Linear Mixed Model: var ~ sequence + period + formulation
Random 1:
   Model: formulation
   Type: VarianceComponents (2)
   Coefnames: ["formulation: 1", "formulation: 2"]
Repeated:
   Model: nothing
   Type: ScaledIdentity (1)
   Coefnames: -

Status: converged

   -2 logREML: 16.2411

   Fixed effects:

Name             Estimate     SE
(Intercept)      1.61         0.334718
sequence: 2      -0.170833    0.277378
period: 2        0.144167     0.289463
period: 3        0.08         0.117047
period: 4        0.144167     0.289463
formulation: 2   -0.0791667   0.277378

Random effects:

   θ vector: [0.447322, 0.367367, 0.185068]

Random 1   formulation: 1   var   0.200097
Random 1   formulation: 2   var   0.134959
Repeated   Var              var   0.0342502
=#
################################################################################

9 Likes

New version of Metida.jl is registered.

NLopt.jl (:LN_BOBYQA) and Optim.jl (Newton) performance tested with random datasets. Memory consumption depends on “subject” size - Pic.1. And two optimization approaches based on computation covariance matrix (V) by blocks and with full V computation. Pic.2. Also more than one random factor with individual structure and “subject” can be specified. For subject size more than 100-150 it works very slow, but it’s not surprising, because inversing of covariance matrix is used. For big Z matrix size MixedModels.jl should be used.

Pic.1
plot100-VC-VC-NLopt-Optim

Pic. 2

plot120-VC-VC

1 Like

How to calculate mixed model when covariance structure type is fa0(2) in SAS.
For example:
proc mixed data=highcv;
class subject seq per trt;
model lcmax=seq per trt/ddfm=satterth;
random trt/sub=subject type=fa0(2);
repeated /group=trt sub=subject;
estimate “组别” trt -1 1/cl alpha=0.1;
ods output Estimates=est;
run;

Hi!
fa0(2) - not implemented in pre-defined covariance structure list. So, as I can see - your data from PK study (bioequivalence?). For 2 trt levels if you have T and R formulation you can use Heterogeneous Compaend Symmetry structure (it exactly the same for this case, and FDA guideline mentioned that it can be used).

using Metida
# load your data (highcv)
lmm = LMM(@formula(lcmax~seq+per+trt), highcv;
    random = VarEffect(@covstr(trt|subject), CSH),
    repeated = VarEffect(@covstr(trt|subject), DIAG),
    )
fit!(lmm)
# then you can get CI
confint(lmm; level = 0.9)
# or you can get CI with DDF you need
df = 38
lci =  coef(lmm)[end] - stderror(lmm)[end]*quantile(TDist(df), 1-0.1/2)
uci =  coef(lmm)[end] + stderror(lmm)[end]*quantile(TDist(df), 1-0.1/2)
 

Version 0.11 of Metida.jl released.

  • Some changes in public API;
  • Multithreading with Polyester.jl;
  • New estimate() & contrast() function;
  • Sweep logdet round-off error correction to prevent negative values;
  • Minor output changes;
  • square and identity link functions;
  • C gradient stored in ModelResult if once calculated;
  • Handle missings values;
  • Fix bugs;
  • Documents update;
  • Improove test coverage.

I hope that performance with Polyester will be better (but it can take i little more memory).

3 Likes