# CLUSTERING DATA NUMERIK MENGGUNAKAN ALGORITME X-MEANS

## (Clustering Numeric Data Using X-Means Algorithm)

### Abstract

Data mining is the extraction of new and useful information from large data sets to support decision-making. Clustering groups data with similar characteristics into the same cluster and is generally applied to numeric or categorical data. K-Means is one algorithm suited to numeric data: it partitions n observations into k clusters so that each observation belongs to the cluster with the nearest mean (centroid). Its weakness is that the number of clusters, k, must be specified by the user in advance. To overcome this weakness, Dan Pelleg and Andrew Moore developed the X-Means algorithm. In X-Means, the user supplies only a range for the number of clusters, and the value of k is estimated within that range from the dataset itself, so no exact number of clusters needs to be specified. The purpose of this study is to examine the X-Means algorithm. The results show that X-Means uses the Bayesian Information Criterion (BIC) value to decide how clusters are split. Supplying a range rather than a fixed number of clusters can make the clustering process more efficient.

**Keywords:** Clustering, K-Means, numeric data, X-Means.
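
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names (`kmeans`, `bic`, `x_means`), the parameters `k_min` and `k_max`, and the spherical-Gaussian form of the BIC are assumptions made for illustration, and the original X-Means algorithm includes refinements that are omitted here. The sketch starts from the lower bound of the range, tries to split each cluster in two with a local K-Means run, and keeps a split only when it yields a higher BIC than the unsplit cluster.

```python
import math
import random


def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm: assign each point to its nearest centroid, then update the means."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids, clusters


def bic(clusters, centroids):
    """BIC of a clustering under an assumed spherical-Gaussian model (higher is better)."""
    n = sum(len(cl) for cl in clusters)
    k = len(clusters)
    d = len(centroids[0])
    sse = sum(math.dist(p, c) ** 2 for cl, c in zip(clusters, centroids) for p in cl)
    variance = max(sse / max(n - k, 1), 1e-12)  # guard against log(0)
    loglik = sum(len(cl) * math.log(len(cl) / n) for cl in clusters if cl)
    loglik -= n * d / 2 * math.log(2 * math.pi * variance)
    loglik -= (n - k) / 2
    n_params = (k - 1) + k * d + 1  # mixing weights, centroid coordinates, shared variance
    return loglik - n_params / 2 * math.log(n)


def x_means(points, k_min, k_max, seed=0):
    """Simplified X-Means: grow k from k_min toward k_max using BIC-tested splits."""
    rng = random.Random(seed)
    centroids, clusters = kmeans(points, rng.sample(points, k_min))
    while len(centroids) < k_max:
        new_centroids = []
        for idx, (c, cl) in enumerate(zip(centroids, clusters)):
            remaining = len(centroids) - idx - 1
            # Only split if the total stays within k_max and the cluster is large enough.
            can_split = len(new_centroids) + remaining + 2 <= k_max and len(cl) >= 4
            if can_split:
                children, parts = kmeans(cl, rng.sample(cl, 2))
                if all(parts) and bic(parts, children) > bic([cl], [c]):
                    new_centroids.extend(children)
                    continue
            new_centroids.append(c)
        if len(new_centroids) == len(centroids):
            break  # no split improved the BIC, so k has stabilized
        centroids, clusters = kmeans(points, new_centroids)
    return centroids, clusters
```

Calling `x_means(points, 1, 5)` on a list of coordinate tuples returns the final centroids together with the points assigned to each. The number of clusters is whatever value within the range the BIC tests settle on, which is the behavior the abstract contrasts with fixing k by hand in K-Means.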

**UNEJ e-Proceeding**, [S.l.], p. 30-35, Aug. 2022. Available at: <https://jurnal.unej.ac.id/index.php/prosiding/article/view/33491>. Date accessed: 03 Dec. 2022.