Abstract: This work is based on the results of an evaluation of data mining techniques, carried out to find those best suited to extracting classification rules from first-year students' academic and demographic data in relation to their academic performance. As a result, a predictive model for academic performance is presented; the model was built by analyzing, selecting and defining the classification rules that properly predict the academic performance…
The main objective of this work is to conduct a systematic review of the literature on predicting the academic performance of university students with data mining techniques. To this end, an exhaustive search was carried out and, after analysis of the collected documentation, the following aspects were considered: methodology, attributes, selection algorithms, techniques, tools, and metrics. These served as the basis for this document. The results of the study showed that the most used methodology is KDD (knowledge discovery in databases); the most important attribute for prediction is CGPA (cumulative grade point average, a measure of academic performance); the most commonly used variable selection algorithm is InfoGain-AttributeEval; the most efficient techniques include Naïve Bayes, Neural Networks (MLP) and Decision Tree (J48); the most used tool for developing the models is the Weka software; and the metrics needed to determine the effectiveness of the models were Precision and Recall.
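As an illustration of two terms in the abstract above, the following minimal sketch computes information gain (the quantity by which an InfoGain-style attribute evaluator ranks attributes) and precision and recall for one class. It is not taken from the reviewed papers or from Weka: the function names, the attributes `attendance` and `shift`, and all data values are hypothetical toy examples.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on an attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: four students, two candidate attributes.
outcome = ["pass", "pass", "fail", "fail"]
attendance = ["high", "high", "low", "low"]   # separates the classes perfectly
shift = ["day", "night", "day", "night"]      # carries no information here

# Rank attributes by information gain, highest first.
ranked = sorted(
    {"attendance": attendance, "shift": shift}.items(),
    key=lambda item: information_gain(item[1], outcome),
    reverse=True,
)

# Hypothetical predictions from some classifier, scored for the "pass" class.
predicted = ["pass", "pass", "pass", "fail"]
precision, recall = precision_recall(outcome, predicted, positive="pass")
```

In this toy case `attendance` ranks first with a gain of 1 bit and `shift` has a gain of 0; the predictions give a precision of 2/3 and a recall of 1 for the "pass" class.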
“…A study on academic performance conducted by [10] using the students' educational and demographic data of 932 students. Decision tree classification was used, and it was found out that socio-demographic variables like marital status, social stratum, whether the student takes day or night classes, gender, and the number of siblings influence the academic performance of a student.…”
Admission to college and the selection of applications have probably become an integral part of the enrolment process at some colleges and universities, yet the process is surrounded by controversy and skepticism. A new area of research that uses data mining techniques is known as Educational Data Mining. It incorporates machine learning algorithms and statistical methods to help interpret students' learning habits and academic performance, and to identify further improvements if needed. This paper focuses on the predictive value of certain academic variables, admission tests, and high school academic records in relation to the performance of Information Technology (IT) students at the end of the first year. To this end, 221 records were used, and the C4.5 and Naïve Bayes algorithms were applied to predict the students' performance. C4.5 achieved 98.64% in 10-fold cross-validation and 96.97% with a 70% training and 30% testing percentage split, whereas Naïve Bayes achieved only 89.14% and 86.36%, respectively. The comparative analysis shows that senior high school track, academic data, and admission test results are the influential attributes for the performance of IT students in their first year. The paper recommends that future studies add data from different years to increase the accuracy of the prediction.
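The two evaluation protocols named in this abstract, 10-fold cross-validation and a 70%/30% percentage split, can be sketched as follows. This is a minimal illustration of how the 221 records might be partitioned, not the authors' procedure: the function names, the shuffling seed, and the choice to round the training size are assumptions.

```python
import random

N_RECORDS = 221  # dataset size reported in the abstract


def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists so that each record is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def percentage_split(n, train_fraction, seed=0):
    """Shuffle once, then cut into a training part and a held-out test part."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_fraction)  # assumption: training size is rounded
    return indices[:cut], indices[cut:]


folds = list(k_fold_indices(N_RECORDS, k=10))
fold_sizes = sorted(len(test) for _, test in folds)

train, test = percentage_split(N_RECORDS, train_fraction=0.70)
```

With 221 records, the ten test folds hold 22 or 23 records each, and the rounded 70%/30% split yields 155 training and 66 test records. Under that assumed split, the reported 96.97% would correspond to 64 of 66 test records classified correctly; this is an inference from the arithmetic, not something the abstract states.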
“…Limited papers fall under extended [19-20, 27, 41, 43, 55, 58-60, 63-65] and holistic [14, 22-24, 39] categories. In [25] included distance of schools from a district office to predict school's performance and accreditation within the vicinity of the district office.…”
Section: S M Muthukrishnan et al. J Fundam Appl Sci 2017 9(4s) 7 (mentioning)
This paper classifies the various existing predictive models used for monitoring and improving students' performance at schools and higher learning institutions. It analyses all areas within the educational data mining methodology. Two databases were chosen for this study and a systematic mapping study was performed. Because this research area is still in its infancy, only 114 articles published from 2012 to 2016 were identified. Of these, 59 articles were reviewed and classified. There is increasing interest and research in educational data mining, particularly in improving students' performance with various predictive and prescriptive models. Most of the models are ultimately devised for pedagogical improvements. There is a great scarcity of portable predictive models that fit into any educational environment. More research is needed on educational big data.