Optimizing the XGBoost Classifier Using Grid Search and Random Search Hyperparameter Tuning for Diabetes Classification

  • Ginanjar Abdurrahman, Universitas Muhammadiyah Jember
  • Hardian Oktavianto
  • Mukti Sintawati

Abstract




In this study, XGBoost classification was applied to diabetes data obtained from the UCI Machine Learning Repository. The first step was to deal with missing values, which were found in several features and had to be handled before the XGBoost algorithm was applied. Each missing value was imputed, that is, replaced with a meaningful substitute value. For modeling, the dataset was divided into training data (80% of the patients) and test data (the remaining 20%). The imputed dataset was then subjected to three treatments: first, XGBoost without hyperparameter tuning; second, hyperparameter tuning using grid search; and third, hyperparameter tuning using random search. In the first treatment, XGBoost without hyperparameter tuning obtained a negative log loss value of 25%, which is interpreted here as an accuracy of 75%. In the second and third treatments, grid search and random search produced the same negative log loss value of 5%, which is interpreted as an accuracy of 95%. Thus, hyperparameter tuning with either grid search or random search substantially improved the classifier's performance over the untuned baseline.
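The steps named in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses the standard library alone rather than XGBoost, the helper names (`impute_median`, `split_80_20`, `grid_candidates`, `random_candidates`) are hypothetical, median imputation is one possible choice of "meaningful value", and the search space and toy predictions are invented for the example. It shows how grid search enumerates every combination of candidate values while random search samples a fixed number of them, and how log loss and accuracy are computed.

```python
import itertools
import math
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def grid_candidates(space):
    """Grid search: every combination of the listed values."""
    names = list(space)
    combos = itertools.product(*(space[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def random_candidates(space, n_iter, seed=0):
    """Random search: n_iter settings, each parameter drawn independently."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in space.items()}
            for _ in range(n_iter)]


def log_loss(y_true, p_pos, eps=1e-15):
    """Mean binary cross-entropy; lower is better and 0 is a perfect score."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, p_pos, threshold=0.5):
    """Fraction of labels matched after thresholding the probabilities."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pos))
    return hits / len(y_true)


# Hypothetical search space: the names are real XGBoost hyperparameters,
# but the candidate values are assumptions, not taken from the paper.
space = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300],
}

grid = grid_candidates(space)                 # 3 * 3 * 2 = 18 settings
sampled = random_candidates(space, n_iter=5)  # 5 randomly drawn settings

# Toy labels and predicted probabilities, invented for illustration.
y_true = [1, 0, 1, 0]
p_pos = [0.9, 0.2, 0.6, 0.4]
toy_log_loss = log_loss(y_true, p_pos)
toy_accuracy = accuracy(y_true, p_pos)
```

In practice each candidate setting would be used to fit a classifier and scored on held-out data, and the setting with the best score would be kept. Log loss and accuracy are computed differently: on the toy data every label is classified correctly, yet the log loss is still greater than zero because the predicted probabilities are not fully confident.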




Published
2022-12-22
How to Cite
ABDURRAHMAN, Ginanjar; OKTAVIANTO, Hardian; SINTAWATI, Mukti. Optimasi Algoritma XGBoost Classifier Menggunakan Hyperparameter Gridesearch dan Random Search Pada Klasifikasi Penyakit Diabetes. INFORMAL: Informatics Journal, [S.l.], v. 7, n. 3, p. 193-198, dec. 2022. ISSN 2503-250X. Available at: <https://jurnal.unej.ac.id/index.php/INFORMAL/article/view/35441>. Date accessed: 20 apr. 2024. doi: https://doi.org/10.19184/isj.v7i3.35441.