Educational data mining constitutes a recent research field which gained popularity over the last decade because of its ability to monitor students' academic performance and predict future progression. Numerous machine learning techniques and especially supervised learning algorithms have been applied to develop accurate models to predict student's characteristics which induce their behavior and performance. In this work, we examine and evaluate the effectiveness of two wrapper methods for semisupervised learning algorithms for predicting the students' performance in the final examinations. Our preliminary numerical experiments indicate that the advantage of semisupervised methods is that the classification accuracy can be significantly improved by utilizing a few labeled and many unlabeled data for developing reliable prediction models.
Machine learning has emerged as a key factor in many technological and scientific advances and applications. Much research has been devoted to developing high performance machine learning models, which are able to make very accurate predictions and decisions on a wide range of applications. Nevertheless, we still seek to understand and explain how these models work and make decisions. Explainability and interpretability in machine learning is a significant issue, since in most of real-world problems it is considered essential to understand and explain the model's prediction mechanism in order to trust it and make decisions on critical issues. In this study, we developed a Grey-Box model based on semi-supervised methodology utilizing a self-training framework. The main objective of this work is the development of a both interpretable and accurate machine learning model, although this is a complex and challenging task. The proposed model was evaluated on a variety of real world datasets from the crucial application domains of education, finance and medicine. Our results demonstrate the efficiency of the proposed model performing comparable to a Black-Box and considerably outperforming single White-Box models, while at the same time remains as interpretable as a White-Box model.
Nowadays, cryptocurrency has infiltrated almost all financial transactions; thus, it is generally recognized as an alternative method for paying and exchanging currency. Cryptocurrency trade constitutes a constantly increasing financial market and a promising type of profitable investment; however, it is characterized by high volatility and strong fluctuations of prices over time. Therefore, the development of an intelligent forecasting model is considered essential for portfolio optimization and decision making. The main contribution of this research is the combination of three of the most widely employed ensemble learning strategies: ensemble-averaging, bagging and stacking with advanced deep learning models for forecasting major cryptocurrency hourly prices. The proposed ensemble models were evaluated utilizing state-of-the-art deep learning models as component learners, which were comprised by combinations of long short-term memory (LSTM), Bi-directional LSTM and convolutional layers. The ensemble models were evaluated on prediction of the cryptocurrency price on the following hour (regression) and also on the prediction if the price on the following hour will increase or decrease with respect to the current price (classification). Additionally, the reliability of each forecasting model and the efficiency of its predictions is evaluated by examining for autocorrelation of the errors. Our detailed experimental analysis indicates that ensemble learning and deep learning can be efficiently beneficial to each other, for developing strong, stable, and reliable forecasting models.
A critical component in the computer-aided medical diagnosis of digital chest X-rays is the automatic detection of lung abnormalities, since the effective identification at an initial stage constitutes a significant and crucial factor in patient’s treatment. The vigorous advances in computer and digital technologies have ultimately led to the development of large repositories of labeled and unlabeled images. Due to the effort and expense involved in labeling data, training datasets are of a limited size, while in contrast, electronic medical record systems contain a significant number of unlabeled images. Semi-supervised learning algorithms have become a hot topic of research as an alternative to traditional classification methods, exploiting the explicit classification information of labeled data with the knowledge hidden in the unlabeled data for building powerful and effective classifiers. In the present work, we evaluate the performance of an ensemble semi-supervised learning algorithm for the classification of chest X-rays of tuberculosis. The efficacy of the presented algorithm is demonstrated by several experiments and confirmed by the statistical nonparametric tests, illustrating that reliable and robust prediction models could be developed utilizing a few labeled and many unlabeled data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.