I am happy to announce the package ExpectationMaximization.jl.
The purpose is to implement the EM algorithm in a generic "Julia way" to find the maximum likelihood estimator (`fit_mle`) for `MixtureModel`s, i.e., mixtures of distributions.
I basically just had to write the pseudocode of the algorithm and rely on the Distributions.jl package, which implements the `fit_mle(::Type{<:Distribution}, y[, w])` estimator I need.
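To make this concrete, here is the weighted `fit_mle` building block from Distributions.jl that each M-step of EM reduces to (the weights play the role of the posterior responsibilities computed in the E-step):

```julia
using Distributions, Random

Random.seed!(42)
y = rand(Normal(2.0, 1.5), 1_000)   # observations
w = rand(1_000)                     # e.g. posterior responsibilities from an E-step
d = fit_mle(Normal, y, w)           # weighted MLE: exactly what each M-step needs
```

Because every distribution in Distributions.jl exposing this method can be plugged in, the EM loop itself never needs to know which distribution it is fitting.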
The result is a package able to fit any mixture of the distributions covered by Distributions.jl. This differs from other R and Python packages, where the available distributions are the ones hand-coded by the package maintainers (a "top-down" approach).
I also added a few `fit_mle` methods (e.g., for product distributions), which I plan to contribute directly to Distributions.jl soon.
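The product-distribution case factorizes nicely: the MLE of a product of independent marginals is just the MLE of each marginal on its own coordinate. A minimal sketch (the helper name `fit_mle_product` is mine for illustration; the actual method signature in the package may differ):

```julia
using Distributions, Random

# Hypothetical helper illustrating the idea: fit each marginal independently
# on its own row of the (d × n) data matrix.
function fit_mle_product(Ds::Vector, y::AbstractMatrix)
    [fit_mle(Ds[i], y[i, :]) for i in eachindex(Ds)]
end

Random.seed!(1)
y = vcat(rand(Normal(0, 1), 1, 500), rand(Exponential(2.0), 1, 500))  # 2 × 500 samples
fits = fit_mle_product([Normal, Exponential], y)
```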
For example, you can fit:

- mixtures of univariate distributions (any crazy combination you want),
- mixtures of multivariate distributions (the famous Gaussian mixtures, but also Bernoulli mixtures for MNIST),
- mixtures of mixtures (I have not seen that anywhere else),
- more? (We just discussed Copula.jl + ExpectationMaximization.jl.)
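A typical fit looks like this (a sketch assuming ExpectationMaximization.jl is installed; the call follows the package's instance convention, where the `MixtureModel` you pass in serves as the initial guess):

```julia
using Distributions
using ExpectationMaximization  # assumed available
using Random

Random.seed!(0)
# "True" model: a two-component mixture of an Exponential and a Gamma
mix_true = MixtureModel([Exponential(10.0), Gamma(0.2, 5.0)], [0.3, 0.7])
y = rand(mix_true, 50_000)

# EM is iterative, so it needs a starting point: the instance passed to
# fit_mle provides both the component types and the initial parameters.
mix_guess = MixtureModel([Exponential(1.0), Gamma(0.5, 1.0)], [0.5, 0.5])
mix_mle = fit_mle(mix_guess, y)  # returns a fitted MixtureModel
```

Swapping `Exponential`/`Gamma` for any other distributions supported by Distributions.jl is all it takes to fit a different mixture.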
I did some benchmarks: it is Julia-fast, meaning it beats existing Python (e.g., scikit-learn) and R packages (most of which are specialized for the normal distribution), even with my basic Julia knowledge. (However, I am always happy to speed things up further.)
The last point is that I use a slightly different convention than the Distributions.jl package: I use the instance version `fit_mle(d::Distribution, y)` rather than the type version `fit_mle(::Type{D}, y)`. This is discussed in several places, like in the docs or in PR#1670.
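The two conventions side by side (the mixture call is commented out and shown only to illustrate why the instance version is natural for EM, which needs an initial guess):

```julia
using Distributions, Random

Random.seed!(2)
y = rand(Normal(3.0, 2.0), 1_000)

# Distributions.jl convention: dispatch on the *type*; no starting point needed
# because the MLE is available in closed form.
d_type = fit_mle(Normal, y)

# Instance convention: dispatch on a distribution *instance*, whose parameters
# double as the initial guess for the iterative EM algorithm, e.g.:
#   mix0 = MixtureModel([Normal(-1, 1), Normal(1, 1)], [0.5, 0.5])
#   fit_mle(mix0, y)
```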