Educational data mining provides a way to predict student academic performance. Psychometric factors such as time management are among the major issues affecting Thai students’ academic performance. Current data sources used to predict students’ performance are limited to manually collected data or data from a single unit of study, which cannot be generalised to indicate overall academic performance. This study uses an additional data source from a university log file to predict academic performance. It investigates the browsing categories and the Internet access activities of students with respect to their time management during their studies. A single source of data is insufficient to identify those students who are at risk of failing their academic studies. Furthermore, there is a paucity of recent empirical studies in this area to provide insights into the relationship between students’ academic performance and their Internet access activities. To contribute to this area of research, we employed two datasets, web-browsing categories and Internet access activity types, to select the best outcomes, and compared different weights in the time and frequency domains. We found that the random forest technique provides the best outcome on these datasets for identifying students who are at risk of failure. We also found that data from Internet access activities yields more accurate outcomes than data from browsing categories alone. The combination of the two datasets gives a better picture of students’ Internet usage and thus helps identify students who are academically at risk of failure. Further work involves collecting more Internet access log file data, analysing it over a longer period, and relating the period of data collection to events during the academic year.
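The abstract above mentions comparing different weights in the time and frequency domains but does not say how the weights were applied. As a rough illustration only, the sketch below blends each browsing category's share of total time with its share of total visits. The function name `category_features`, the parameter `time_weight`, the linear blend, and the sample sessions are all assumptions made for this example and are not taken from the study.

```python
from collections import defaultdict


def category_features(records, time_weight=0.5):
    """Blend each category's share of time and share of visits into one score.

    records: iterable of (category, seconds) pairs, one per logged session.
    time_weight: hypothetical weight in [0, 1] placed on the time share;
    the remainder is placed on the frequency share.
    """
    if not 0.0 <= time_weight <= 1.0:
        raise ValueError("time_weight must be between 0 and 1")

    seconds = defaultdict(float)
    visits = defaultdict(int)
    for category, duration in records:
        seconds[category] += duration
        visits[category] += 1

    total_seconds = sum(seconds.values())
    total_visits = sum(visits.values())

    features = {}
    for category in visits:
        time_share = seconds[category] / total_seconds if total_seconds else 0.0
        freq_share = visits[category] / total_visits
        features[category] = (
            time_weight * time_share + (1.0 - time_weight) * freq_share
        )
    return features


# Made-up sessions for one student: (browsing category, seconds spent).
sessions = [("education", 1200), ("social", 300), ("social", 300), ("video", 600)]

for weight in (0.0, 0.5, 1.0):
    print(weight, category_features(sessions, time_weight=weight))
```

Feature vectors built this way for each candidate weight could then be passed to a classifier such as a random forest and the resulting accuracies compared, which is one plausible reading of the comparison the abstract describes.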
Educators in higher education institutions often evaluate student academic performance using statistical results obtained from their online Learning Management System (LMS) dataset, which has limitations. This study differs from the current body of literature by including an additional dataset that advances knowledge about the factors affecting student academic performance. The key aims of this study are fourfold. First, it fills a gap in the educational literature by applying machine learning techniques in educational data mining, making use of Internet usage behaviour log files and LMS data. Second, it analyses LMS data and Internet usage log files with machine learning techniques to predict students at risk of failure, with greater explanation added by combining student demographic data. Third, it uses demographic features to help explain the predictions in terms understandable to educators. Fourth, it uses a range of Internet usage data, categorised by type of usage and type of web browsing, to increase prediction accuracy.
In this era of a data-driven society, useful data (Big Data) is often unintentionally ignored because convenient tools are lacking and software is expensive. For example, web log files can be used to identify explicit information about browsing patterns when users access websites. Some hidden information, however, cannot be derived directly from the log files, and external resources may be needed to discover more knowledge from browsing patterns. The purpose of this study is to investigate the application of web usage mining based on web log files. The outcome of this study sets further directions for investigating what implicit information is embedded in log files and how it can be extracted efficiently and effectively. Further work involves incorporating social media data to improve the quality of business decisions.
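To make the idea of explicit browsing-pattern information concrete, here is a minimal sketch that parses web server log lines and counts page-to-page transitions per client. It assumes the logs follow the Common Log Format and are already in time order; the abstract does not state the log format used, and the sample lines and the names `parse_line` and `page_transitions` are invented for this example.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "METHOD path PROTO" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def page_transitions(lines):
    """Count consecutive page pairs per host, assuming lines are in time order."""
    last_page = {}
    transitions = Counter()
    for line in lines:
        entry = parse_line(line)
        # Skip unparseable lines and unsuccessful requests.
        if entry is None or entry["status"] != "200":
            continue
        host, path = entry["host"], entry["path"]
        if host in last_page and last_page[host] != path:
            transitions[(last_page[host], path)] += 1
        last_page[host] = path
    return transitions


# Made-up log lines for illustration only.
SAMPLE = [
    '10.0.0.1 - - [01/Mar/2024:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Mar/2024:10:00:05 +0000] "GET /courses HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Mar/2024:10:00:07 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Mar/2024:10:00:09 +0000] "GET /courses HTTP/1.1" 200 1024',
    '10.0.0.1 - - [01/Mar/2024:10:00:12 +0000] "GET /missing HTTP/1.1" 404 0',
]

for pair, count in page_transitions(SAMPLE).most_common():
    print(pair, count)
```

Transition counts like these are explicit information that can be read straight from the log. The hidden information the abstract refers to, such as what kind of content a path represents, would need an external mapping that this sketch does not attempt.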