SMOTE-Based Homogeneous Ensemble Methods for Software Defect Prediction

Balogun, Abdullateef Oluwagbemiga; Lafenwa-Balogun, Fatimah B.; Mojeed, Hammed A.; Adeyemo, Victor Elijah; Akande, Oluwatobi Noah; Akintola, Abimbola G.; Bajeh, Amos Orenyi; Usman-Hamza, Fatimah E.

doi:10.1007/978-3-030-58817-5_45

Cited by 32 publications

(22 citation statements)

References 40 publications


“…The results showed that no single ensemble method outperformed others in all datasets. However, the researchers observed that the ensembles of a few ranking techniques performed better than the ensembles of many ranking techniques. In [59], a review of state-of-the-art ensemble techniques for class imbalance problems was conducted.…”

Table rows interleaved with the quote in the citing review: [72] 2020, Springer Link, Conference, International Conference on Computational Science and Its Applications (ICCSA 2020); [73] 2021, Springer Link, Journal, Applied Intelligence; [74] 2021, Springer Link, Journal, Neural Computing and Applications.

Section: RQ1: Which Ensemble Learning Techniques Are Applied For Software Defect Prediction? (mentioning)

confidence: 99%

“…Their technique showed promising results with the highest AUC of 0.93 in one group of source and target datasets. In [72], researchers proposed a method using SMOTE and homogeneous ensemble methods (bagging and boosting) to improve the performance of defect prediction models. They employed DT and BN as baseline classifiers in their model.…”

Section: RQ1: Which Ensemble Learning Techniques Are Applied For Software Defect Prediction? (mentioning)

confidence: 99%
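The oversampling step named in the statement above can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the cited authors' implementation; the function name `smote`, its parameters, and the toy data are assumptions for illustration only. The balanced data would then be passed to a bagging or boosting ensemble, which is not shown here.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (illustrative sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Nearest neighbours of `base` inside the minority class, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy minority class (e.g. defective modules with two metrics).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is an interpolation between two existing minority samples, all generated points stay inside the bounding box of the original minority class.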

“…[71] proposed a model to predict defect across projects with a HDP; [72] proposed a method using SMOTE and homogeneous ensemble methods (bagging and boosting)…”

Section: RQ1: Which Ensemble Learning Techniques Are Applied For Software Defect Prediction? (mentioning)

confidence: 99%

“…In [71], the AUC, recall, precision and F-measure were used to measure the performance of the proposed ensemble learning technique. In [72], the AUC, Accuracy and F-measure are employed to measure the performance of the proposed defect prediction model. In [73], five performance measures namely precision, recall, AUC (area under ROC curve), specificity, and G-means were used.…”

Section: RQ2: Which Evaluation Criterion Is Used To Measure the Performance Of Ensemble Learners? (mentioning)

confidence: 99%
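The three measures attributed to [72] in the statement above (AUC, accuracy and F-measure) can be computed from first principles, as in the following sketch. The labels and scores are hypothetical toy values, not results from any cited study, and the function names are chosen here for illustration. AUC is computed as the fraction of (positive, negative) pairs that the scores rank correctly, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = defective, 0 = non-defective.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

acc = accuracy(y_true, y_pred)    # 6 of 8 correct
f1 = f_measure(y_true, y_pred)    # precision = recall = 2/3
area = auc(y_true, scores)        # 14 of 15 pairs ranked correctly
```

On imbalanced defect data, accuracy alone can look high while the minority class is poorly detected, which is one reason AUC and F-measure are commonly reported alongside it.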


Software Defect Prediction Using Ensemble Learning: A Systematic Literature Review

et al. 2021


Recent advances in the domain of software defect prediction (SDP) include the integration of multiple classification techniques to create an ensemble or hybrid approach. This technique was introduced to improve the prediction performance by overcoming the limitations of any single classification technique. This research provides a systematic literature review on the use of the ensemble learning approach for software defect prediction. The review is conducted after critically analyzing research papers published since 2012 in four well-known online libraries: ACM, IEEE, Springer Link, and Science Direct. In this study, five research questions that cover the different aspects of research progress on the use of ensemble learning for software defect prediction are addressed. To extract the answers to the identified questions, the 46 most relevant papers were shortlisted after a thorough systematic research process. This study will provide compact information regarding the latest trends and advances in ensemble learning for software defect prediction and provide a baseline for future innovations and further reviews. Through our study, we discovered that the ensemble methods most frequently employed by researchers are random forest, boosting, and bagging. Less frequently employed methods include stacking, voting and Extra Trees. Researchers proposed many promising frameworks, such as EMKCA, SMOTE-Ensemble, MKEL, SDAEsTSE, TLEL, and LRCR, using ensemble learning methods. The AUC, accuracy, F-measure, Recall, Precision, and MCC were mostly utilized to measure the prediction performance of models. WEKA was widely adopted as a platform for machine learning. Many researchers showed through empirical analysis that feature selection and data sampling were important pre-processing steps that improve the performance of ensemble classifiers.


“…Decision tree algorithms are a family of machine learning classification and regression algorithms that fits a model on a given dataset having considered the entropy of some or all attributes for making its splitting decision. Tree-based machine learning algorithms are widely used and acceptable for various research and industrial areas, even as distant as software defect prediction in the field of software engineering [24] and even for the prediction of factors in educational management [25]. Decision Tree models are known to always produce interpretable models.…”

Section: B Implemented Models (mentioning)

confidence: 99%
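The entropy-based splitting decision mentioned in the statement above can be sketched as follows: the entropy of the class labels is compared before and after partitioning the data on an attribute, and the reduction is the information gain. The attribute names and toy records below are hypothetical, and this sketch shows only the scoring of a candidate split, not a full tree learner.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the partitions created by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records: two candidate attributes and a class label.
labels = ["attack", "attack", "normal", "normal"]
protocol = ["udp", "udp", "tcp", "tcp"]   # separates the classes perfectly
flag = ["a", "b", "a", "b"]               # carries no class information

gain_protocol = information_gain(protocol, labels)
gain_flag = information_gain(flag, labels)
```

A tree learner using this criterion would prefer the attribute with the larger gain, here the hypothetical `protocol` attribute.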

Detecting Generic Network Intrusion Attacks using Tree-based Machine Learning Methods

Alsariera

2021

IJACSA


The development of Intrusion Detection Systems (IDS) has had a solid impact in mitigating internal and external cyber threats, among other cybersecurity methods. The machine learning-based method for IDS has proven to be an effective approach to detecting either anomalies or multiple classes of intrusion. For the detection of various types of intrusion by a single IDS model, it was found that high overall accuracy of the IDS model does not translate to high accuracy for each attack type. Some intrusion attacks share similarities with other attacks and thereby evade detection, one of which is the generic attack. The generic attack is notorious because a single generic attack can compromise a whole range of block ciphers. Therefore, this study proposed a machine learning framework to specifically detect generic network intrusion by implementing two (2) decision tree algorithms. The decision tree methods were developed using two distinct variants, namely the J48 and Random Tree algorithms. A balanced generic network dataset was curated and used for model development. A 10-fold cross-validation technique was implemented for model development and performance evaluation, where all obtainable performance scores were extracted and presented. The performances of the decision tree methods for generic network intrusion attack detection were comparatively analyzed and also evaluated against existing methods. The proposed methods of this study are robust, stable and empirically seen to have outperformed existing methods.
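The 10-fold cross-validation mentioned in the abstract above can be sketched as an index-splitting routine: the samples are shuffled once, divided into k folds, and each fold serves as the test set exactly once while the remaining folds form the training set. The function name `k_fold_indices`, the seed, and the sample count of 25 are assumptions for illustration; the study's own tooling and fold assignment are not stated in the text.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

A model would be trained on each `train` list and scored on the matching `test` list, with the per-fold scores averaged to give the reported performance.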
