Predicting and improving the academic achievement of university students is a multifactorial problem that depends on both the student and the learning environment. Addressing a problem of this complexity requires holistic approaches that include quantitative methods and data-based analyses. Given the low success rates and high dropout rates, especially in open education programs that deliver mass education, academic success is an important research area in terms of both its causes and its consequences.
This study aims to predict the academic success of students and to identify those at risk. Success grades and demographic data of students enrolled in the Turkish Language, Atatürk's Principles and History of Revolution, Foreign Language, and Disaster Culture courses were analyzed using data mining methods.
Using data mining, the study predicted the academic success status of 26,708 students enrolled in Istanbul University open and distance education programs between 2011 and 2017. Predictions were based on grades in the common compulsory courses and on demographic data. The study used supervised classification models and was conducted in SPSS Modeler 18. First, the full dataset was split into 70% training data and 30% test data. Models were then built with the Random Forest, Logistic Regression, C&R Tree (C&RT), Tree-AS, C5.0, Naive Bayes, NeuralNet, CHAID, QUEST, and SVM algorithms. Model performance was compared on accuracy, sensitivity, specificity, F1 score, positive predictive value, negative predictive value, and the Matthews Correlation Coefficient (MCC). The C&RT model demonstrated the best performance, achieving the highest specificity value of 0.915.
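For readers unfamiliar with the seven comparison criteria, the following is a minimal sketch of how each one can be derived from a binary confusion matrix. It is not the study's SPSS Modeler workflow; the function name `binary_metrics`, the helper `safe_div`, and the counts in the example are hypothetical and chosen only for illustration.

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute the seven criteria from confusion-matrix counts.

    tp, fp, tn, fn are the true positive, false positive,
    true negative, and false negative counts on the test set.
    """
    sensitivity = safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = safe_div(tn, tn + fp)   # true negative rate
    ppv = safe_div(tp, tp + fp)           # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)           # negative predictive value
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f1 = safe_div(2 * tp, 2 * tp + fp + fn)
    mcc = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; these are NOT the study's results.
example = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it tends to be more informative than accuracy alone when the successful and unsuccessful classes are imbalanced, which is one reason to report it alongside the other criteria.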