Advances in technology have contributed greatly to the recent application of data mining in education. However, selecting appropriate algorithms to “mine” knowledge from educational data remains a difficult challenge for researchers and analysts. This paper contributes to the use of classification algorithms in academic performance prediction. The predictive ability of four popular algorithms was compared: C4.5 Decision Tree (CDT), Multilayer Perceptron (MLP), Naïve Bayes (NB) and Random Forest (RF). The models were built using a student dataset from selected private senior high schools in Ghana. The algorithms were compared on Accuracy, Recall, Specificity, F-Measure and running time. Across all evaluation schemes (80:20 and 70:30 training-to-test splits and 10-fold cross-validation), the results indicated that all the algorithms performed well in the classification. However, the Naïve Bayes algorithm performed significantly better than MLP and CDT on some splits. NB, CDT and RF had the shortest running times, while MLP took the longest.
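The four quality measures named above can all be derived from a confusion matrix. The following is a minimal sketch for the binary case only; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper, whose class labels and exact metric definitions are not stated in the abstract.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common classification metrics from binary confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. A ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    recall = ratio(tp, tp + fn)        # also called sensitivity
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    # F-Measure here is the harmonic mean of precision and recall (F1).
    f_measure = ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class problem, these quantities would typically be computed per class and then averaged; which averaging scheme the authors used is not given here.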
We examined a similarity measure for text document clustering. Data mining is a challenging field with many research and application areas. Text document clustering, a subset of data mining, groups and organizes a large quantity of unstructured text documents into a small number of meaningful clusters. An algorithm that calculates the degree of closeness of documents from their document matrix was used to query the terms/words in each document. We also determined whether the documents in a given set are similar to or different from one another when these terms are queried. We found that the ability to rank and approximate documents using a matrix allows the use of Singular Value Decomposition (SVD) as an enhanced text data mining algorithm. In addition, applying SVD to a high-dimensional matrix yields a lower-dimensional matrix that exposes the relationships in the original matrix by ordering them from the most variant to the least.
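The idea of reducing a term-document matrix with SVD and then comparing documents can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the three toy documents, the raw term counts, the choice of rank `k = 2`, and the use of cosine similarity as the closeness measure are all assumptions, since the abstract does not specify them.

```python
import numpy as np

# Toy corpus, invented for illustration only.
docs = [
    "data mining finds patterns in data",
    "text mining finds patterns in text documents",
    "the football match ended in a draw",
]

# Term-document matrix: one row per term, one column per document,
# with raw term counts as entries (an assumed weighting).
vocab = sorted({word for doc in docs for word in doc.split()})
A = np.array([[doc.split().count(term) for doc in docs] for term in vocab],
             dtype=float)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values in s sorted from
# largest to smallest, so the leading components carry the most variation.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k strongest components; each document becomes a k-vector.
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # shape: (number of docs, k)


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


sim_01 = cosine(doc_vecs[0], doc_vecs[1])  # two documents about mining
sim_02 = cosine(doc_vecs[0], doc_vecs[2])  # mining vs. football
print(f"doc0 vs doc1: {sim_01:.3f}")
print(f"doc0 vs doc2: {sim_02:.3f}")
```

In this toy setting the two mining documents end up much closer to each other in the reduced space than either is to the football document, which is the kind of relationship the lower-dimensional matrix is meant to expose.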