Given two different sets of clusters obtained after two independent runs of the
Question
Given two different sets of clusters obtained after two independent runs of the K-Means algorithm to cluster some unlabeled dataset, how do you evaluate the quality of these two sets of clusters? (Just describe evaluation criteria without mathematical expressions)
Explanation / Answer
To compare the clusterings obtained in independent runs of the algorithm, the easiest criterion is the Sum of Squared Error (SSE). K-Means tries to minimize the squared distance of each point from its assigned cluster centre, so the SSE is the value of that objective actually reached by a run. When both runs use the same data and the same number of clusters, the run with the lower SSE is the better clustering.
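As a rough illustration of the SSE comparison, here is a minimal Python sketch. The function name `sse`, the toy points and the two label lists are made up for this example; it also assumes each cluster centre is the mean of the points assigned to it, which is the case for a converged K-Means run.

```python
def sse(points, labels):
    """Sum of squared distances from each point to the mean of its cluster."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Cluster centre assumed to be the mean of its members.
        centre = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centre[d]) ** 2 for m in members for d in range(dim))
    return total


# Hypothetical toy data: four 2-D points and the labels from two runs.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
run_a = [0, 0, 1, 1]  # puts nearby points in the same cluster
run_b = [0, 1, 0, 1]  # puts distant points in the same cluster

sse_a = sse(points, run_a)
sse_b = sse(points, run_b)
better = "run_a" if sse_a < sse_b else "run_b"
print(sse_a, sse_b, better)
```

On this toy data the first run has the much smaller SSE, so it would be preferred. The comparison is only fair because both runs use the same points and the same number of clusters.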
Besides this, scatter criteria can also be used for evaluation. Scatter criteria are derived from the scatter matrices, which reflect the within-cluster scatter, the between-cluster scatter and their sum, the total scatter matrix. In other words, they use the variance information in the data. The smaller the within-cluster scatter (variance), the better the clustering; since the total scatter of a dataset is fixed, this also means a larger between-cluster scatter.
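The scatter idea can be sketched in the same way. The snippet below is only an illustration under assumptions: it summarizes each scatter matrix by its trace (one common scalar summary, not the only possible criterion), and the name `scatter_traces` and the toy data are hypothetical.

```python
def scatter_traces(points, labels):
    """Return (within, between, total) scatter, each as the trace of its scatter matrix."""
    dim = len(points[0])
    n = len(points)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]

    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    within = 0.0
    between = 0.0
    for members in clusters.values():
        centre = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        # Spread of the members around their own cluster centre.
        within += sum((m[d] - centre[d]) ** 2 for m in members for d in range(dim))
        # Spread of the cluster centre around the overall mean, weighted by cluster size.
        between += len(members) * sum((centre[d] - overall[d]) ** 2 for d in range(dim))

    total = sum((p[d] - overall[d]) ** 2 for p in points for d in range(dim))
    return within, between, total


# Hypothetical toy data: four 2-D points and the labels from two runs.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
run_a = [0, 0, 1, 1]
run_b = [0, 1, 0, 1]

w_a, b_a, t_a = scatter_traces(points, run_a)
w_b, b_b, t_b = scatter_traces(points, run_b)
print(w_a, b_a, t_a)
print(w_b, b_b, t_b)
```

Both runs share the same total scatter because it depends only on the data, so the run with the smaller within-cluster value necessarily has the larger between-cluster value.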