Imbalanced datasets degrade the performance of classification algorithms in predicting student performance. Several techniques exist for combating class imbalance, and one of the best known is the Synthetic Minority Oversampling Technique (SMOTE). It is an oversampling technique that generates synthetic data along the line segments joining minority instances and their neighbors. However, it has drawbacks regarding the distribution of the synthetic samples and the generation of noisy samples, which is the main reason for its many variations. Affinity Propagation (AP) SMOTE is one of the cluster-based approaches to SMOTE. This approach uses affinity propagation to automatically produce clusters and cluster exemplars, which are used to select the clusters to be oversampled. In this way, sparsity and the generation of noisy samples are avoided. The data used for the study are the performance records of freshman students of Davao Oriental State College of Science and Technology (DOSCST), together with their enrolment data. The dataset comprises 10 features and 2112 instances, and the imbalance ratio between the majority and minority classes is 17.85. SMOTE and AP SMOTE are applied to the imbalanced dataset. The output is used with the J48 and Naïve Bayes classifiers to predict students at risk of low performance in their freshman year at the college. The classifiers' performance is evaluated using f-measure, g-mean, and Area Under the Curve (AUC). Results showed that AP SMOTE outperforms the original SMOTE, with a percentage lead of 0.60%, 0.88%, and 1.2%, respectively, using the J48 classifier. The percentage lead for Naïve Bayes is 3.2%, 6.58%, and 3.30%, respectively.
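The interpolation step of SMOTE and two of the evaluation measures named above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names (`smote_point`, `smote`, `f_measure`, `g_mean`), the neighbor count `k`, and the toy data are assumptions made for illustration, and the affinity propagation clustering step of AP SMOTE is not shown.

```python
import math
import random


def smote_point(x, neighbor, gap):
    """Return a point on the line segment from x to neighbor (0 <= gap <= 1)."""
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (illustrative only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x, excluding x itself
        others = [p for p in minority if p is not x]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        neighbor = rng.choice(neighbors)
        # a random gap places the new sample between x and its neighbor
        synthetic.append(smote_point(x, neighbor, rng.random()))
    return synthetic


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Toy minority class with two features (hypothetical values)
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]]
new_samples = smote(minority, n_new=5)
```

Because every synthetic sample lies between two existing minority instances, a sample interpolated toward a noisy or isolated neighbor is itself likely to be noisy, which is the weakness that cluster-based variants such as AP SMOTE aim to reduce.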