Abstract. Humans cannot process and analyze large amounts of data in a short time, and bankruptcy data is no exception. Bankruptcy data carries complex information, so analyzing it quickly and efficiently requires supporting technology. Data science technology enables large-scale data processing and analysis using parallel processing techniques, and parallel processing can be implemented in machine learning models.
Purpose: This study aims to implement a machine learning model using the Light Gradient Boosting Machine (LightGBM) classification algorithm, optimized with Extreme Gradient Boosting (XGBoost) Feature Importance, to increase the accuracy of bankruptcy prediction.
Methods/Study design/approach: Bankruptcy prediction is carried out by applying LightGBM as the classification model, with the XGBoost algorithm used as a Feature Importance technique to improve model accuracy. The dataset used is the Taiwanese Bankruptcy dataset, collected from the Taiwan Economic Journal for 1999 to 2009, which contains 6,819 records. Because the dataset is imbalanced, this study applies random oversampling.
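The random oversampling step mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library and toy data; the function name `random_oversample`, the seed, and the sample values are assumptions made for illustration, not the study's implementation. The abstract does not state whether resampling is applied before or after the train/test split, so the sketch shows only the resampling operation itself.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class in the original data.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to make up the shortfall (none for the majority).
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data (hypothetical): 6 non-bankrupt firms (0) and 2 bankrupt firms (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))  # both classes now have the same count
```

Because the added rows are exact copies of existing minority rows, oversampling after splitting off the test set is the usual way to keep duplicated records out of the evaluation data.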
Result/Findings: Testing the model with a confusion matrix showed that LightGBM with XGBoost Feature Importance achieved an accuracy of 99.227%.
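For reference, accuracy is derived from a binary confusion matrix as the share of correct predictions among all predictions. The sketch below shows that calculation; the function name `accuracy_from_confusion` and the counts are hypothetical and are not the study's confusion matrix.

```python
def accuracy_from_confusion(tp, tn, fp, fn):
    """Accuracy = (true positives + true negatives) / all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts, chosen only to illustrate the formula.
acc = accuracy_from_confusion(tp=90, tn=95, fp=5, fn=10)
print(f"{acc:.3%}")
```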
Novelty/Originality/Value: The results indicate that XGBoost Feature Importance can be used to improve LightGBM's performance in bankruptcy prediction.