2020
DOI: 10.1007/s42979-020-0119-4

Evaluation of Sampling-Based Ensembles of Classifiers on Imbalanced Data for Software Defect Prediction Problems

Abstract: Defect prediction in software projects plays a crucial role in reducing quality-based risk and increasing the capability of detecting faulty program modules. Hence, classification approaches that anticipate software defect proneness from static code characteristics have attracted a great deal of attention in recent years. While several studies show that relying on a single classifier creates a performance bottleneck, ensembles of classifiers might effectively enhance classification performan…

Cited by 25 publications (19 citation statements)
References 54 publications (60 reference statements)
“…In recent years, JIT-SDP has become a research hotspot in the field of defect prediction because of its fine-grained and instant traceability. In the software defect prediction problem, Khuat et al [5] empirically evaluated the importance of sampling various classifier sets of imbalanced data by combining sampling technology and ensemble learning model and predicted positive effects for data with category imbalance problem. Zhu et al [6] proposed a just-in-time defect prediction model DAECNN-JDP based on a denoising autoencoder and convolutional neural network.…”
Section: Just-in-time Software Defect Prediction
Citation type: mentioning
confidence: 99%
“…The prediction of defects in software systems is very important and there is great interest in the development of novel high-performance software defect predictors. The purpose of SDP models is to improve the quality of software application systems [15]. Many models have been constructed to recognize the defects in software modules using artificial intelligence and statistical methods [1,18,19,20,21,22].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…This study selects imbalanced datasets from the public PROMISE repository for experimental purposes [12,13,14], so this motivates a solution such as applying the sampling methods and there is great interest in building unbiased classifiers that start from imbalanced software defect data. Although several experiments in the previous studies [12,15,16,17] are conducted based on these datasets using many ML models, very few of them are based on CNN and GRU. Even there is no experiment using CNN and GRU combined with oversampling techniques in the literature.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The longitudes and latitudes are combined. In this study, the outliers were oversampled using the SMOTE [38] algorithm based on the original 50 outliers and expanded it to a total data percentage of 50% with a difference of 5% to test the robustness and efficacy of the LOKI technique. Table 3 shows the proportions and volumes of data added.…”
Section: A Dataset
Citation type: mentioning
confidence: 99%
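
Several of the statements above refer to oversampling the minority class with SMOTE until it reaches a target share of the data. The sketch below illustrates the core idea only: each synthetic point is an interpolation between a minority sample and one of its nearest minority neighbours. The function names `smote` and `needed_for_share`, the parameters, and the majority-class size of 950 used in the usage note are assumptions made for illustration; none of them come from the cited papers, and this is not the implementation any of those studies used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    (Hypothetical helper for illustration; stdlib only.)
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def needed_for_share(n_minority, n_majority, target_share):
    """Synthetic points needed so the minority forms target_share of all data."""
    # Solve (n_minority + s) / (n_minority + n_majority + s) = target_share for s.
    s = (target_share * (n_minority + n_majority) - n_minority) / (1.0 - target_share)
    return max(0, math.ceil(s))
```

As a usage example under the assumed sizes, `needed_for_share(50, 950, 0.5)` gives the number of synthetic points required to bring 50 minority samples up to half of the data, and that count can be passed as `n_synthetic` to `smote`. Because every synthetic point is a convex combination of two existing minority samples, the generated points stay within the region already spanned by the minority class.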