MixedModels v2.0.0

dmbates · July 26, 2019, 5:10pm

I plan to release v"2.0.0" of the MixedModels package on Aug. 1. This version is compatible with StatsModels v"0.6.0" (a.k.a. Terms2.0), which allows for much greater flexibility in formula specification. Also, the internals have been rewritten extensively but that should not be obvious to users.

If you use the MixedModels package I would appreciate it if you could install and test the master branch to let me know of difficulties in advance of the release.

Christopher_Fisher · July 26, 2019, 6:20pm

I encountered some version conflicts with Distributions when I tried to switch to master:

(v1.1) pkg> add MixedModels #master
  Updating git-repo `https://github.com/dmbates/MixedModels.jl.git`
 Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Distributions [31c24e10]:
 Distributions [31c24e10] log:
 ├─possible versions are: [0.1.0-0.1.4, 0.2.0-0.2.13, 0.3.0, 0.6.4-0.6.7, 0.7.0-0.7.6, 0.8.0-0.8.10, 0.9.0, 0.10.0-0.10.2, 0.11.0-0.11.1, 0.12.0-0.12.5, 0.13.0, 0.14.0-0.14.2, 0.15.0, 0.16.0-0.16.4, 0.17.0, 0.18.0, 0.19.1-0.19.2, 0.20.0, 0.21.0] or uninstalled
 ├─restricted to versions [0.11.0-0.11, 0.12.0-0.12, 0.13.0-0.13, 0.14.0-0.14, 0.15.0-0.15, 0.16.0-0.16, 0.17.0-0.17, 0.18.0-0.18, 0.19.0-0.19, 0.20.0-0.20] by MixedModels [ff71e718], leaving only versions [0.11.0-0.11.1, 0.12.0-0.12.5, 0.13.0, 0.14.0-0.14.2, 0.15.0, 0.16.0-0.16.4, 0.17.0, 0.18.0, 0.19.1-0.19.2, 0.20.0]
 │ └─MixedModels [ff71e718] log:
 │   ├─possible versions are: 2.0.0 or uninstalled
 │   └─MixedModels [ff71e718] is fixed to version 2.0.0
 └─restricted to versions 0.21.0 by an explicit requirement — no versions left

(v1.1) pkg> st Distributions
    Status `~/.julia/environments/v1.1/Project.toml`
  [31c24e10] Distributions v0.21.0
  [1fd47b50] QuadGK v2.1.0
  [2913bbd2] StatsBase v0.31.0
  [4c63d2b9] StatsFuns v0.8.0
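
For readers puzzling over the log above, the resolver's narrowing can be reproduced by hand. This is only a minimal sketch of the arithmetic: the version list is a subset of the one in the log, and the names `allowed_by_mixedmodels` and `explicitly_required` are hypothetical helpers, not part of Pkg.

```julia
# A subset of the Distributions versions listed in the resolver log.
possible = [v"0.19.1", v"0.19.2", v"0.20.0", v"0.21.0"]

# Per the log, MixedModels#master allowed 0.11 through 0.20 at that time.
# (Hypothetical helper, not a Pkg function.)
allowed_by_mixedmodels(v) = v"0.11.0" <= v < v"0.21.0"

# Per the log, the environment carries an explicit requirement on 0.21.0.
# (Hypothetical helper, not a Pkg function.)
explicitly_required(v) = v == v"0.21.0"

# First restriction: the compat bounds declared by MixedModels.
after_compat = filter(allowed_by_mixedmodels, possible)

# Second restriction: the explicit requirement in the active environment.
after_requirement = filter(explicitly_required, after_compat)

println("after compat bounds:        ", after_compat)
println("after explicit requirement: ", after_requirement)
```

The second filter leaves nothing, which corresponds to the "no versions left" line in the error.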

dmbates · July 26, 2019, 6:46pm

Thanks for the report. The Project.toml file for MixedModels#master allows up to v"0.20" of Distributions and I have just pushed a commit to allow v"0.21" but on my systems I still have Distributions at v"0.19.2".

It looks as if MixedModels depends on some other package that restricts the version of Distributions. Can anyone suggest how I would determine the reverse dependencies of Distributions and intersect that with the direct and implied dependencies of MixedModels?

As one might expect, I only use Distributions in one place and that usage is for a call that has been part of the package since its inception.
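
The reverse-dependency question above can be sketched in plain Julia as a small graph computation. This is an illustration only: the graph `deps` is a toy with made-up edges (PkgA through PkgD are placeholders, not real packages, and the edges are not the real registry data), and `transitive_deps`, `reverse_deps`, and `constrainers` are hypothetical helper names.

```julia
# Toy dependency graph: package => direct dependencies.
# Hypothetical edges for illustration only, NOT real registry data.
deps = Dict(
    "MixedModels"   => ["Distributions", "PkgA", "PkgB"],
    "PkgA"          => ["Distributions"],
    "PkgB"          => ["PkgC"],
    "PkgC"          => String[],
    "PkgD"          => ["Distributions"],  # not reachable from MixedModels
    "Distributions" => String[],
)

# All direct and implied (transitive) dependencies of `root`.
function transitive_deps(deps, root)
    seen = Set{String}()
    stack = copy(get(deps, root, String[]))
    while !isempty(stack)
        pkg = pop!(stack)
        pkg in seen && continue
        push!(seen, pkg)
        append!(stack, get(deps, pkg, String[]))
    end
    return seen
end

# Packages that list `target` as a direct dependency.
reverse_deps(deps, target) =
    Set{String}(p for (p, ds) in deps if target in ds)

# Reverse dependencies of `target` that also sit in the dependency
# closure of `root` (including `root` itself).
function constrainers(deps, root, target)
    closure = push!(transitive_deps(deps, root), root)
    return intersect(reverse_deps(deps, target), closure)
end

suspects = constrainers(deps, "MixedModels", "Distributions")
println(sort!(collect(suspects)))
```

Each package in the result is a candidate whose compat bounds on the target are worth checking. With a real environment, the graph would have to be built from the manifest rather than typed in by hand.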

Nosferican · July 28, 2019, 9:45pm

Will the

fit(LinearMixedModel,
    @formula(Y ~ 1 + A + B + (1|G)),
    data)

syntax be supported in MixedModels v2.0.0 or just the

fit!(LinearMixedModel(@formula(Y ~ 1 + A + B + (1|G)),
                      data))

syntax?

On that note, it seems the ability to pass contrasts has been removed on master.

dmbates · July 29, 2019, 3:25pm

The reason I prefer the fit!(LinearMixedModel(@formula(...), ...)) form is that there are arguments to the fit! method that don’t apply to the model construction. Things like verbose, maxiter, …

Also, the user can modify the form of the model between the time that it is constructed and the time it is passed to fit!. For a LinearMixedModel the user can switch to REML estimation. For a GeneralizedLinearMixedModel, nAGQ, the number of quadrature points in the adaptive Gauss-Hermite evaluation, can be changed.

In the vanilla case, it seems to me that there is little overhead in writing fit!(LinearMixedModel(@formula(...), ...)) compared to fit(LinearMixedModel, @formula(...), ...), but I will admit that I sometimes botch the first form myself by typing a comma after LinearMixedModel.

It probably wouldn’t be a big deal to allow the fit(LinearMixedModel, @formula(...), ...) form as an alternative in the vanilla case. Do you think it is worthwhile?

Nosferican · July 29, 2019, 3:41pm

From a design standpoint, I do agree that it is nice to separate the arguments that are part of the model being created from those used in the fitting process. However, I think there shouldn’t be ambiguity among arguments passed through a fit method. Since the arguments are passed as keyword arguments, the fit method should easily pass along both the model-builder arguments and the fitting-process arguments without issues. For users, it might be easier not to have to worry about whether a given argument is considered part of the model or of the fitting process, which may vary depending on the implementation and internals. I tend to favor smart/robust, which might take a bit more logic from the developer side but makes it closer to fool-proof for end-users.
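
To make the keyword-routing idea concrete, here is a minimal sketch using a toy type. Everything in it is an assumption for illustration: `ToyModel`, its fields, the keyword names `reml`, `maxiter`, and `verbose` as used here, and the `model_keys` tuple are hypothetical, and this is not how MixedModels implements anything. In a real package the two functions would extend the generic `fit` and `fit!` rather than define new ones.

```julia
# Toy stand-in for a model type; NOT the MixedModels implementation.
mutable struct ToyModel
    formula::String
    reml::Bool         # a model-construction option
    fitted::Bool
    iterations::Int
end

# The constructor accepts only model-building keywords.
ToyModel(formula::AbstractString; reml::Bool = false) =
    ToyModel(formula, reml, false, 0)

# The in-place fit accepts only fitting-process keywords.
function fit!(m::ToyModel; maxiter::Int = 100, verbose::Bool = false)
    verbose && println("fitting ", m.formula, " (reml = ", m.reml, ")")
    m.iterations = min(maxiter, 10)  # pretend the optimizer stops by step 10
    m.fitted = true
    return m
end

# Keywords that belong to model construction (hypothetical list).
model_keys = (:reml,)

# Convenience wrapper: route each keyword to the stage that accepts it.
function fit(::Type{ToyModel}, formula::AbstractString; kwargs...)
    model_kw = Dict{Symbol,Any}(k => v for (k, v) in kwargs if k in model_keys)
    fit_kw = Dict{Symbol,Any}(k => v for (k, v) in kwargs if !(k in model_keys))
    return fit!(ToyModel(formula; model_kw...); fit_kw...)
end

m = fit(ToyModel, "Y ~ 1 + A + B + (1|G)"; reml = true, maxiter = 5)
println((m.fitted, m.reml, m.iterations))
```

With this layout the two-step form stays available for users who want to modify the model between construction and fitting, while the one-step form covers the vanilla case.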

The other point which makes the fit method very nice is that it allows for a seamless interface with other packages. For example, a package may decide to use one model type based on features of the Formula, but if one implementation only provides fit! it requires a more elaborate handling.

Lastly, in case only fit! is provided, the fit method should throw the default “not implemented for this type” exception rather than the current error behavior.

Disclosure, losing the fit method would force me to update the code in Bioequivalence.jl so I would rather just handle it at the MixedModels level for the next release.

Juan · July 29, 2019, 7:43pm

Any changes related to the ability to handle datasets larger than RAM?

Nosferican · July 29, 2019, 8:16pm

Popular request at JuliaCon for the whole stats ecosystem… I don’t think anything has happened just yet, but we are well aware of the need…
