CLUSTERING DATA NUMERIK MENGGUNAKAN ALGORITME X-MEANS

(Clustering Numeric Data Using X-Means Algorithm)

  • Ayya Agustina Riza, Mathematics Study Program, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Surakarta
  • Dewi Retno Sari Saputro, Mathematics Study Program, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Surakarta

Abstract

Data mining is the extraction of new and useful information from large data sets to support decision-making. Clustering is a technique that groups data with similar characteristics into the same cluster, and it is generally applied to numeric or categorical data. The K-Means algorithm is one of the algorithms that can be used for numeric data. K-Means divides n observations into k clusters so that each observation belongs to the cluster with the nearest mean (centroid). Its weakness lies in determining the number of clusters, which the user must specify in advance. To overcome this weakness, Dan Pelleg and Andrew Moore developed the X-Means algorithm. In X-Means, the user supplies a range for the number of clusters and the value of k is estimated from the dataset itself, so no exact number of clusters has to be specified. The purpose of this study is to examine the X-Means algorithm. The results show that X-Means uses the Bayesian Information Criterion (BIC) value to decide how clusters are divided. Supplying a range for the number of clusters, rather than a single value, can make the clustering process more efficient.


Keywords: Clustering, K-Means, numeric data, X-Means.
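
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is a simplified reading of the X-Means idea, not the authors' implementation: K-Means is run with the smallest allowed number of clusters, each cluster is tentatively split in two, and a split is kept only if it raises the BIC. Everything named here is an assumption made for illustration, including the function names (`kmeans`, `bic`, `split_seeds`, `xmeans`, `two_blobs`), the spherical-Gaussian form of the BIC, the farthest-point seeding of the two child centroids, and the synthetic data.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def kmeans(points, centroids, iters=50):
    """Lloyd's algorithm; empty clusters are dropped."""
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[j].append(p)
        clusters = [c for c in clusters if c]
        new = [mean(c) for c in clusters]
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

def bic(clusters, centroids):
    """BIC (higher is better) assuming one shared spherical Gaussian variance."""
    r = sum(len(c) for c in clusters)          # number of points
    k, m = len(clusters), len(centroids[0])    # clusters, dimensions
    if r <= k:
        return -math.inf
    ss = sum(dist2(p, mu) for c, mu in zip(clusters, centroids) for p in c)
    var = max(ss / (m * (r - k)), 1e-12)       # floor avoids log(0)
    loglik = sum(len(c) * math.log(len(c) / r) for c in clusters)
    loglik -= 0.5 * r * m * math.log(2 * math.pi * var) + 0.5 * m * (r - k)
    n_params = (k - 1) + k * m + 1             # weights, centroids, variance
    return loglik - 0.5 * n_params * math.log(r)

def split_seeds(cluster, mu):
    """Two far-apart points used to seed a tentative 2-way split."""
    a = max(cluster, key=lambda p: dist2(p, mu))
    b = max(cluster, key=lambda p: dist2(p, a))
    return [a, b]

def xmeans(points, k_min, k_max):
    """Grow the number of clusters from k_min up to at most k_max."""
    centroids, clusters = kmeans(points, [tuple(p) for p in points[:k_min]])
    for _ in range(k_max):                     # bounded number of rounds
        room = k_max - len(centroids)
        proposal = []
        for cluster, mu in zip(clusters, centroids):
            children, parts = kmeans(cluster, split_seeds(cluster, mu))
            if (room > 0 and len(children) == 2
                    and bic(parts, children) > bic([cluster], [mu])):
                proposal.extend(children)      # the split improves BIC
                room -= 1
            else:
                proposal.append(mu)            # keep the parent centroid
        if len(proposal) == len(centroids):
            break                              # no cluster was split
        centroids, clusters = kmeans(points, proposal)
    return centroids, clusters

def two_blobs(seed=0, n=50):
    """Synthetic 2-D data: two well-separated Gaussian blobs."""
    rng = random.Random(seed)
    return [(rng.gauss(c, 0.5), rng.gauss(c, 0.5)) for c in (0.0, 10.0) for _ in range(n)]

if __name__ == "__main__":
    centroids, _ = xmeans(two_blobs(), k_min=1, k_max=4)
    print(len(centroids), centroids)
```

With the range set to `k_min=1` and `k_max=4`, the caller never names an exact k; for data like the two synthetic blobs above, one would expect the BIC test to accept the first split and reject further ones. A fuller implementation would typically use random restarts and a more careful variance estimate.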

Published
2022-08-14
How to Cite
RIZA, Ayya Agustina; SAPUTRO, Dewi Retno Sari. CLUSTERING DATA NUMERIK MENGGUNAKAN ALGORITME X-MEANS. UNEJ e-Proceeding, [S.l.], p. 30 - 35, aug. 2022. Available at: <https://jurnal.unej.ac.id/index.php/prosiding/article/view/33491>. Date accessed: 26 apr. 2024.