The Implementation of Minimum Forest Graph for Centroid Updating Process on K-Means Algorithm

  • Achmad Maududie

Abstract




K-Means is one of algorithms based on partitioning clustering method, particularly on sum-of-squared error criterion. It generates a single partition data for a single group of data that has high degree in similarity. This method has some advantages such as linear complexity, ease of interpretation, simplicity of implementation, speed of convergence, adaptability to sparse data, and versatile in almost every aspect. However, this method also has some weaknesses, such as very sensitive to initial centroids (center) that drives the quality of clustering output. Although there is a recommendation to make some runs with different initial centroids and select the initial centroid that produces cluster with minimum error, frequently, this procedure does not achieve a satisfying result.


This paper introduces a new method to overcome this problem through enhancing the refinement mechanism in K-Means algorithm. This method focuses on rebuilding new centroids using minimum forest graphs to reproduce better models in the refinement mechanism. Based on the conducted experiments, the proposed method yielded better value of gain information and Dunn index than Forgy method in both relatively well distributed and relatively noisy dataset. Furthermore, the members of each cluster in every run of the conducted experiments were consistent for the proposed method, while it was not happen for Forgy method.




Published
2023-07-25
How to Cite
MAUDUDIE, Achmad. The Implementation of Minimum Forest Graph for Centroid Updating Process on K-Means Algorithm. INFORMAL: Informatics Journal, [S.l.], v. 3, n. 3, p. 67-76, july 2023. ISSN 2503-250X. Available at: <https://jurnal.unej.ac.id/index.php/INFORMAL/article/view/10067>. Date accessed: 03 may 2024.