A question about the "K" of kmeans

abcde · July 6, 2020, 10:07am

What methods can assist in determining the number of classes in a cluster? Which package should I use?
thanks!!!

tlienart · July 6, 2020, 10:46am

KMeans is an unsupervised method, which means that your question is ill-posed. In particular, the fact that you talk about “classes” suggests that you may be confusing clustering and classification.

For a clustering algorithm, you can assess the “quality” of the clustering using a number of metrics, a common one being the silhouette score. However, you should see clustering as a way to encode your data rather than a way to extract a classification rule from it.

In terms of packages, you might want to look at Clustering.jl or ParallelKMeans.jl.
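
To make the silhouette score mentioned above concrete, here is a minimal sketch in plain Python (standard library only). It is not the API of Clustering.jl or ParallelKMeans.jl; the function name `silhouette_score` and the toy data are assumptions chosen for illustration. For each point, `a` is the mean distance to the other points in its own cluster, `b` is the smallest mean distance to the points of any other cluster, and the point's score is `(b - a) / max(a, b)`. The overall score is the average over all points, so values near 1 suggest compact, well-separated clusters.

```python
import math
from collections import defaultdict


def silhouette_score(points, labels):
    """Mean silhouette over all points (illustrative helper, Euclidean distance)."""
    clusters = defaultdict(list)
    for p, label in zip(points, labels):
        clusters[label].append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # Mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, hence the len(own) - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two obvious groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

One way to use this for choosing k is to cluster the data for several values of k and compare the resulting scores, keeping in mind that a high score only says the clusters are geometrically tidy, not that they correspond to meaningful classes.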

bernhard · July 6, 2020, 11:13am

If you google your question, you may find some blog posts and tutorials (which are likely not related to Julia but may nevertheless be of interest to you) that provide ideas and approaches for selecting K.

Ali_Vahdati · July 6, 2020, 11:20am

The most common way is to choose k manually by visually inspecting the data. Another method is the elbow method. The elbow method involves plotting the minimized cost of the algorithm against a range of k values. The cost should decrease as you increase k. Initially, the cost decreases quickly, and then at a specific k, it starts to decrease more slowly. That specific k can be a good candidate for the number of clusters. The elbow method is not always useful because you do not always get an “elbow” in your cost curve.
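
The elbow method described above can be sketched as follows, in plain Python with no external packages. Several things here are assumptions made for illustration rather than anything from this thread: the cost is taken to be the within-cluster sum of squared distances, the helper names (`kmeans_cost`, `farthest_point_init`, `make_blobs`) are hypothetical, the data is synthetic with three well-separated groups, and the initialization is a simple deterministic farthest-point rule instead of the randomized schemes real packages typically use.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic init: start from the first point, then repeatedly add
    the point that is farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans_cost(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def make_blobs(seed=0, per_blob=30):
    """Synthetic 2-D data: three Gaussian groups (illustrative only)."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in blob_centers
        for _ in range(per_blob)
    ]


points = make_blobs()
costs = {k: kmeans_cost(points, k) for k in range(1, 7)}
for k, cost in costs.items():
    print(k, round(cost, 1))
```

On data like this the cost should fall steeply up to k = 3 and only slightly afterwards, which is the "elbow". On real data the curve is often smoother, which is the limitation noted above.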
