Purpose
This paper evaluates educational data mining methods with the aim of predicting student academic performance more accurately in a university course setting. The investigation draws on student engagement data collected both in real time and during self-paced activities.
Design/methodology/approach
Classification data mining techniques have been adapted to predict students’ academic performance. Four algorithms, Naïve Bayes, Logistic Regression, k-Nearest Neighbour and Random Forest, were used to generate predictive models. Process mining features have also been integrated to determine their effectiveness in improving the accuracy of predictions.
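As an illustration of how general activity features and process mining features might sit side by side, the following is a minimal sketch only. The abstract does not state which process mining features were used, so the directly-follows counts here are an assumption, as are the event-log layout `(student_id, timestamp, activity)`, the function name `build_features` and the feature names.

```python
from collections import Counter, defaultdict


def build_features(event_log):
    """Return {student_id: feature dict} from (student_id, timestamp, activity) rows.

    Hypothetical helper: the feature set is an assumption, not the paper's.
    """
    # Group each student's activities into a time-ordered trace.
    traces = defaultdict(list)
    for student_id, _timestamp, activity in sorted(
        event_log, key=lambda row: (row[0], row[1])
    ):
        traces[student_id].append(activity)

    features = {}
    for student_id, trace in traces.items():
        # General features: simple aggregates over the student's activity.
        row = {
            "n_events": len(trace),
            "n_distinct_activities": len(set(trace)),
        }
        # Process mining style features (assumed): how often one activity
        # is directly followed by another in the student's trace.
        follows = Counter(zip(trace, trace[1:]))
        for (src, dst), count in follows.items():
            row[f"df:{src}>{dst}"] = count
        features[student_id] = row
    return features
```

Feature dictionaries of this shape could then be turned into a numeric matrix (for example with scikit-learn's `DictVectorizer`) and passed to any of the four classifiers named above.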
Findings
The results show that combining general features derived from student activities with process mining features yields some improvement in prediction accuracy. Of the four algorithms, Random Forest is found to be more accurate than the other three, and the difference is statistically significant. The best-performing classifier model is then validated by predicting students’ final-year academic performance for the subsequent year.
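The abstract does not name the statistical test behind the significance claim. As one possible way to compare two classifiers evaluated on the same cross-validation folds, the sketch below uses an exact paired sign-flip permutation test. The choice of test, the function name `paired_permutation_p_value` and the per-fold accuracies in the example are all assumptions for illustration, not the study's method or results.

```python
from itertools import product


def paired_permutation_p_value(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    scores_a and scores_b hold one accuracy per fold for two classifiers
    evaluated on the same folds.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))

    # Under the null hypothesis, each paired difference is equally likely
    # to have either sign, so every sign assignment is enumerated.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = abs(sum(s * d for s, d in zip(signs, diffs)))
        if permuted >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Hypothetical placeholder accuracies, not taken from the paper.
    random_forest = [0.80, 0.82, 0.79, 0.81, 0.83]
    naive_bayes = [0.75, 0.78, 0.74, 0.77, 0.79]
    print(paired_permutation_p_value(random_forest, naive_bayes))
```

The enumeration grows as 2^n in the number of folds, so this exact form suits the small fold counts typical of cross-validation; with many paired scores a sampled permutation test would be the usual substitute.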
Research limitations/implications
The present study was limited to datasets gathered over one semester and for one course. A dataset spanning more courses would likely yield more promising outcomes. Moreover, demographic information could have provided further insight into students’ performance. Future work will address some of these limitations.
Originality/value
The model developed from this research can provide value to institutions in making process- and data-driven predictions on students’ academic performances.
This study investigates current approaches to learning analytics (LA) dashboarding while highlighting the challenges education providers face in operationalizing them. We analyze recent dashboards for their ability to provide actionable insights that prompt learners to make informed adjustments to their learning habits. Our study finds that most LA dashboards employ only surface-level descriptive analytics, while only a few go further and use predictive analytics. In response to the gaps identified in recently published dashboards, we propose a state-of-the-art dashboard that not only leverages descriptive analytics components but also integrates machine learning in a way that enables both predictive and prescriptive analytics. We demonstrate how emerging analytics tools can help learners interpret the behavior of a predictive model, and more specifically understand how the model arrives at a given prediction. We highlight how these capabilities build trust and satisfy emerging regulatory requirements surrounding predictive analytics. Additionally, we show how data-driven prescriptive analytics can be deployed within dashboards to provide concrete advice to learners, thereby increasing the likelihood of triggering behavioral changes. Our proposed dashboard is the first of its kind in terms of the breadth of analytics it integrates, and it is currently deployed for trials at a higher education institution.