Student dropout is a persistent challenge for educational institutions. Several factors have been identified as potential contributors to dropout, and many researchers have applied data mining methods to analyze and predict student dropout and to identify dropout variables in advance. The main objective of this study is to find the best model for identifying predictors of student dropout from 17,432 student records from a private university in Jakarta. We also analyze and measure the correlation between demographic indicators and academic performance to predict student dropout using three single classifiers: K-Nearest Neighbor (KNN), Naïve Bayes (NB), and Decision Tree (DT). We identified the following indicators as dropout predictors: student attendance, homework grade, mid-test grade, final-test grade, total credits, GPA, student's area, parental income, parental education level, gender, and age. The single classifiers achieved accuracies of only 64.29% (NB), 64.84% (DT), and 75.27% (KNN), whereas combining the algorithms in an ensemble classifier with Gradient Boosting as the meta-classifier improved accuracy to about 79.12%. In addition, we obtained a best accuracy of about 98.82% using this method when tested with 10-fold cross-validation.
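As a rough illustration of the ensemble setup described above, the following Python sketch stacks KNN, NB, and DT base learners under a Gradient Boosting meta-classifier and scores the result with 10-fold cross-validation. It is a minimal sketch under several assumptions, not the study's implementation: scikit-learn's `StackingClassifier` is assumed as the combination scheme, the data are synthetic stand-ins generated with `make_classification` (11 features to mirror the 11 listed predictors), and all hyperparameters are illustrative. Accuracies from this sketch say nothing about the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 11 numeric features, binary
# dropout label. The real student records are not available here.
X, y = make_classification(
    n_samples=500, n_features=11, n_informative=6, random_state=0
)

# The three single classifiers named in the abstract. KNN is wrapped in a
# scaler because it is distance-based; hyperparameters are illustrative.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
]

# Gradient Boosting as the meta-classifier, trained on out-of-fold
# predictions of the base learners (stacking is assumed here).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(random_state=0),
    cv=5,
)

# 10-fold cross-validated accuracy of the whole ensemble.
scores = cross_val_score(stack, X, y, cv=10, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"10-fold mean accuracy: {mean_accuracy:.4f}")
```

Wrapping the whole stack in `cross_val_score` keeps the meta-classifier's training folds separate from the evaluation fold, which avoids an optimistic accuracy estimate.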