[ANN] UnsupervisedClustering.jl, a unified interface for clustering with optimization techniques

I’m happy to announce UnsupervisedClustering.jl, a Julia package that provides a consistent interface for unsupervised clustering algorithms, along with strategies to escape local optima and reduce overfitting.

UnsupervisedClustering.jl is not a new package, but it was not previously announced here. It was developed during my master’s thesis research, where we explored regularization and optimization techniques in model-based clustering. The study resulted in a published paper that introduces novel approaches to improve clustering quality and robustness.

This package has also been used in production at PSR, an energy company and contributor to the JuMP ecosystem.

One of the main advantages of UnsupervisedClustering.jl is its consistent interface across all clustering methods.

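using UnsupervisedClustering

# `data` holds the observations and `k` is the number of clusters (both defined elsewhere)

# local search algorithms
kmeans = Kmeans()

# metaheuristic algorithms, each wrapping a local search
genetic = GeneticAlgorithm(local_search = kmeans)
multi_start = MultiStart(local_search = kmeans)
random_swap = RandomSwap(local_search = kmeans)

# every algorithm is run through the same fit function
result1 = fit(kmeans, data, k)
result2 = fit(genetic, data, k)
result3 = fit(multi_start, data, k)
result4 = fit(random_swap, data, k)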
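For readers unfamiliar with why wrapping a local search helps, here is a minimal, self-contained sketch of the multi-start idea: run a plain k-means several times from different random initializations and keep the run with the lowest objective. This is not the package's implementation; `simple_kmeans` and `multi_start_kmeans` are hypothetical names invented for this illustration, and the sketch assumes observations are stored as rows of an n × d matrix.

```julia
using Random
using Statistics

# Squared Euclidean distance between two vectors.
sqdist(a, b) = sum(abs2, a .- b)

# Plain Lloyd-style k-means (hypothetical helper, not the package's Kmeans).
# Returns the assignments and the sum of squared distances to the assigned centers.
function simple_kmeans(rng::AbstractRNG, data::AbstractMatrix, k::Integer; max_iterations::Integer = 100)
    n, d = size(data)
    # Initialize the centers from k distinct random observations.
    centers = data[randperm(rng, n)[1:k], :]
    assignments = zeros(Int, n)
    for _ in 1:max_iterations
        # Assignment step: nearest center for every observation.
        new_assignments = [argmin([sqdist(data[i, :], centers[j, :]) for j in 1:k]) for i in 1:n]
        new_assignments == assignments && break
        assignments = new_assignments
        # Update step: each non-empty cluster's center moves to the mean of its members.
        for j in 1:k
            members = findall(==(j), assignments)
            isempty(members) || (centers[j, :] = vec(mean(data[members, :], dims = 1)))
        end
    end
    objective = sum(sqdist(data[i, :], centers[assignments[i], :]) for i in 1:n)
    return assignments, objective
end

# Multi-start: repeat the local search and keep the best solution found.
function multi_start_kmeans(rng::AbstractRNG, data::AbstractMatrix, k::Integer; restarts::Integer = 10)
    best_assignments, best_objective = simple_kmeans(rng, data, k)
    for _ in 2:restarts
        assignments, objective = simple_kmeans(rng, data, k)
        if objective < best_objective
            best_assignments, best_objective = assignments, objective
        end
    end
    return best_assignments, best_objective
end

# Synthetic data: three Gaussian blobs in two dimensions.
rng = MersenneTwister(1)
data = vcat(randn(rng, 50, 2), randn(rng, 50, 2) .+ 5.0, randn(rng, 50, 2) .- 5.0)

# Same seed for both calls, so the first restart reproduces the single run.
_, single_objective = simple_kmeans(MersenneTwister(2), data, 3)
best_assignments, best_objective = multi_start_kmeans(MersenneTwister(2), data, 3; restarts = 10)
println("single run: ", single_objective, "  best of 10: ", best_objective)
```

Because both calls start from the same seed, the multi-start objective can never be worse than the single run here; any improvement comes from the extra restarts.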

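Because every algorithm shares the same interface, algorithms can also be composed. For example, k-means can be chained into a Gaussian mixture model (GMM):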

kmeans = Kmeans()

# n and d are assumed to be the number of observations and features in `data`
estimator = UnsupervisedClustering.RegularizedCovarianceMatrices.EmpiricalCovarianceMatrix(n, d)
gmm = GMM(estimator = estimator)

# Chain algorithms together
chain = ClusteringChain(kmeans, gmm)

result = fit(chain, data, k)
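As a rough illustration of what chaining buys you, the sketch below assumes that a chain hands each stage's result to the next stage as its starting point (the post implies this but does not spell it out). It is a toy on one-dimensional data, not the package's code; `spread_init`, `lloyd_refine`, and `run_chain` are hypothetical names made up for this example.

```julia
using Statistics

# Stage 1 (hypothetical): cheap initialization that ignores any previous result.
# Picks k evenly spaced values from the sorted data; assumes k >= 2.
function spread_init(data::AbstractVector{<:Real}, k::Integer, previous)
    sorted = sort(data)
    idx = round.(Int, range(1, length(sorted), length = k))
    return sorted[idx]
end

# Stage 2 (hypothetical): Lloyd iterations warm-started from the previous centers.
function lloyd_refine(data::AbstractVector{<:Real}, k::Integer, previous; iterations::Integer = 20)
    centers = copy(previous)
    for _ in 1:iterations
        # Assign each point to its nearest center.
        labels = [argmin(abs.(x .- centers)) for x in data]
        # Move each non-empty cluster's center to the mean of its members.
        for j in 1:k
            members = data[labels .== j]
            isempty(members) || (centers[j] = mean(members))
        end
    end
    return centers
end

# The chain threads the result of each stage into the next one.
run_chain(stages, data, k) = foldl((previous, stage) -> stage(data, k, previous), stages; init = nothing)

data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1, 9.0]
centers = run_chain((spread_init, lloyd_refine), data, 2)
println(centers)
```

The same reasoning motivates chaining k-means into a GMM: a cheap method produces a reasonable starting point, and a more expressive model refines it.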