I am happy to announce the package ExpectationMaximization.jl.
The purpose is to implement the EM algorithm in a generic "Julia way" to find the maximum likelihood estimator (`fit_mle`) for `MixtureModel`s, i.e., mixtures of distributions.
I basically just had to write the pseudocode of the algorithm and rely on the Distributions.jl package, which implements the `fit_mle(::Type{<:Distribution}, y[, w])` estimator I need.
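To make this concrete, here is the weighted `fit_mle` building block from Distributions.jl that each M-step of EM reduces to (the weights play the role of the posterior responsibilities computed in the E-step):

```julia
using Distributions, Random

Random.seed!(42)
y = rand(Normal(2.0, 1.5), 1_000)   # observations
w = rand(1_000)                     # e.g. posterior responsibilities from an E-step
d = fit_mle(Normal, y, w)           # weighted MLE: exactly what each M-step needs
```

Because every distribution in Distributions.jl exposing this method can be plugged in, the EM loop itself never needs to know which distribution it is fitting.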
The result is a package able to fit any mixture of the distributions covered by Distributions.jl. This differs from other R and Python packages, where the available distributions are the ones hand-coded by the package maintainers (a "top-down" approach).
I also added a few `fit_mle` methods (e.g., for product distributions), which I plan to contribute directly to Distributions.jl soon.
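The product-distribution case factorizes nicely: the MLE of a product of independent marginals is just the MLE of each marginal on its own coordinate. A minimal sketch (the helper name `fit_mle_product` is mine for illustration; the actual method signature in the package may differ):

```julia
using Distributions, Random

# Hypothetical helper illustrating the idea: fit each marginal independently
# on its own row of the (d × n) data matrix.
function fit_mle_product(Ds::Vector, y::AbstractMatrix)
    [fit_mle(Ds[i], y[i, :]) for i in eachindex(Ds)]
end

Random.seed!(1)
y = vcat(rand(Normal(0, 1), 1, 500), rand(Exponential(2.0), 1, 500))  # 2 × 500 samples
fits = fit_mle_product([Normal, Exponential], y)
```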
For example, you can fit:

- mixtures of univariate distributions (any crazy combination you want),
- mixtures of multivariate distributions (the famous Gaussian mixtures, but also Bernoulli mixtures for MNIST),
- mixtures of mixtures (I have not seen that anywhere else),
- more? (We just discussed Copula.jl + ExpectationMaximization.jl.)
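A typical fit looks like this (a sketch assuming ExpectationMaximization.jl is installed; the call follows the package's instance convention, where the `MixtureModel` you pass in serves as the initial guess):

```julia
using Distributions
using ExpectationMaximization  # assumed available
using Random

Random.seed!(0)
# "True" model: a two-component mixture of an Exponential and a Gamma
mix_true = MixtureModel([Exponential(10.0), Gamma(0.2, 5.0)], [0.3, 0.7])
y = rand(mix_true, 50_000)

# EM is iterative, so it needs a starting point: the instance passed to
# fit_mle provides both the component types and the initial parameters.
mix_guess = MixtureModel([Exponential(1.0), Gamma(0.5, 1.0)], [0.5, 0.5])
mix_mle = fit_mle(mix_guess, y)  # returns a fitted MixtureModel
```

Swapping `Exponential`/`Gamma` for any other distributions supported by Distributions.jl is all it takes to fit a different mixture.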
I did some benchmarks: it is Julia-fast, meaning it beats existing Python (e.g., scikit-learn) and R packages (most of which are specialized for the normal distribution), even with my basic Julia knowledge. (However, I am always happy to speed things up further.)
The last point is that I use a slightly different convention than the Distributions.jl package: I use the instance version `fit_mle(d::Distribution, y)` rather than the type version `fit_mle(::Type{D}, y)`. This is discussed in several places, like in the docs or in PR#1670.
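The two conventions side by side (the mixture call is commented out and shown only to illustrate why the instance version is natural for EM, which needs an initial guess):

```julia
using Distributions, Random

Random.seed!(2)
y = rand(Normal(3.0, 2.0), 1_000)

# Distributions.jl convention: dispatch on the *type*; no starting point needed
# because the MLE is available in closed form.
d_type = fit_mle(Normal, y)

# Instance convention: dispatch on a distribution *instance*, whose parameters
# double as the initial guess for the iterative EM algorithm, e.g.:
#   mix0 = MixtureModel([Normal(-1, 1), Normal(1, 1)], [0.5, 0.5])
#   fit_mle(mix0, y)
```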