Regularization of machine learning models with penalized regression on data containing multicollinearity (Case study: predicting the Human Development Index in 34 provinces of Indonesia)

Nur Khamidah; Kusman Sadik; Agus M Soleh; Gerry Alfa Dito

doi:10.19184/mims.v24i1.40360

Authors

Nur Khamidah, IPB University
Kusman Sadik, IPB University
Agus M Soleh, IPB University
Gerry Alfa Dito, IPB University

Abstract

This research models high-dimensional data containing multicollinearity with four machine-learning algorithms: Random Forest, K-Nearest Neighbor (kNN), XGBoost, and Regression Tree. Before modeling, regularization was carried out with three penalized regressions: ridge regression, least absolute shrinkage and selection operator (LASSO) regression, and Elastic Net regression. A total of 100 predictor variables and 1 response variable, taken from the BPS 2022 Human Development Index data for 34 provinces in Indonesia, were used and standardized. A simulation was also applied to highly correlated data drawn from two distributions, uniform and normal, with parameter values taken from the empirical data. The results showed that ridge regularization is the best method for producing accurate and stable predictions. Furthermore, there was no difference in root mean square error (RMSE) between data with and without standardization. Across all the data analyzed, the kNN model was better than the other models on simulation data, and the Random Forest and XGBoost models were better than the other models on empirical data. In addition, the Regression Tree model is not recommended according to the results of this study.

Keywords: regularization, multicollinearity, ridge, LASSO, elastic net
MSC2020: 62J07
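
The effect of a ridge penalty on multicollinear predictors, and the RMSE used to compare models, can be illustrated with a minimal sketch. It is not the authors' code. The data are synthetic (two nearly identical predictors with 34 observations), the penalty values are arbitrary, and the helper names `center`, `ridge_2d`, and `rmse` are hypothetical. Only ridge is shown because it has a closed-form solution; LASSO and Elastic Net need an iterative solver.

```python
import math
import random


def center(values):
    """Subtract the mean so the model can be fitted without an intercept."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ridge_2d(x1, x2, y, lam):
    """Closed-form ridge for two predictors: solve (X'X + lam*I) b = X'y."""
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    g1 = sum(u * t for u, t in zip(x1, y))
    g2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b  # close to zero when lam = 0 and x1 is almost equal to x2
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Synthetic, highly correlated data (assumption: not the BPS data).
rng = random.Random(0)
n = 34
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [u + 0.01 * rng.gauss(0, 1) for u in x1]  # almost a copy of x1
y = [u + v + rng.gauss(0, 1) for u, v in zip(x1, x2)]
x1, x2, y = center(x1), center(x2), center(y)

# lam = 0 is ordinary least squares; larger lam shrinks the coefficients.
for lam in (0.0, 1.0, 10.0):
    b1, b2 = ridge_2d(x1, x2, y, lam)
    fitted = [b1 * u + b2 * v for u, v in zip(x1, x2)]
    print(f"lambda={lam:5.1f}  b1={b1:9.3f}  b2={b2:9.3f}  "
          f"training RMSE={rmse(y, fitted):.3f}")
```

The printed RMSE is a training error, which always favors the unpenalized fit. A comparison of the kind reported in the abstract would need held-out data or cross-validation.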
