2021
DOI: 10.3390/app11146663

A Paired Learner-Based Approach for Concept Drift Detection and Adaptation in Software Defect Prediction

Abstract: The early and accurate prediction of defects helps in testing software and therefore leads to an overall higher-quality product. Due to drift in software defect data, prediction model performance may degrade over time. Very few earlier works have investigated the significance of concept drift (CD) in software defect prediction (SDP). Their results have shown that CD is present in software defect data and that it has a significant impact on the performance of defect prediction. Motivated by this observation, …

Cited by 7 publications (9 citation statements)
References 45 publications
“…Bangash et al (Bangash, Sahar et al 2020) evaluated the stability of cross-project defect prediction models by measuring their performance estimates over different periods, considering G-measure, MCC, AUC, and F-Score as performance measures. Gangwar et al (Gangwar, Kumar et al 2021) proposed a paired learner-based drift detection method in SDP in which two learners (a stable one and a reactive one) examine the same subset of data and drift is detected from the dissimilarity of their predictions, using AUC as the measure of model instability. Kabir et al (Kabir, Keung et al 2021) assumed that previous software releases are labeled (clean or defective) to train inter-release defect prediction (IRDP) models, evaluating their robustness against CD using AUC, Recall, and pf (probability of false alarm) as performance stability measures.…”
Section: Related Evaluation Methods (mentioning; confidence: 99%)
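A minimal Python sketch of the paired-learner idea summarized in the statement above may help: a stable learner trained on all chunks seen so far is paired with a reactive learner trained only on a recent window, and drift is flagged when their predictions on the current chunk diverge beyond a threshold. The classifier, window size, threshold, and the use of raw prediction disagreement (rather than the AUC-based dissimilarity used by Gangwar et al) are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def paired_learner_drift(chunks, window=3, threshold=0.2):
    # chunks: chronologically ordered list of (X, y) arrays (one per release/period).
    # Returns the indices of chunks at which drift was signalled.
    drift_points = []
    hist_X, hist_y = [], []
    for i, (X, y) in enumerate(chunks):
        if len(hist_X) >= window:
            # Stable learner: trained on the full history accumulated so far.
            stable = GaussianNB().fit(np.vstack(hist_X), np.concatenate(hist_y))
            # Reactive learner: trained only on the most recent window of chunks.
            reactive = GaussianNB().fit(np.vstack(hist_X[-window:]),
                                        np.concatenate(hist_y[-window:]))
            # Dissimilarity of the two learners' predictions on the current chunk;
            # a large value suggests the old concept no longer matches recent data.
            disagreement = np.mean(stable.predict(X) != reactive.predict(X))
            if disagreement > threshold:
                drift_points.append(i)
                # Adapt: drop stale history so the stable learner is rebuilt
                # from post-drift data only.
                hist_X, hist_y = [], []
        hist_X.append(X)
        hist_y.append(y)
    return drift_points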
“…In the literature, MCC and Brier Score measures are calculated from the distance between the predicted class probability and the actual class, which requires the test data label. Whenever this distance increases significantly, it is assumed that a CD has occurred (Gangwar, Kumar et al 2021). However, we propose a new method that identifies CD by checking the distance between repeated predictions without considering the class label of the test data.…”
Section: The Methodology Of Changing the Distribution Of Frequent For... (mentioning; confidence: 99%)
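The label-dependent distance check that this statement contrasts itself against can be sketched as follows: the Brier score (the mean squared distance between the predicted defect probability and the actual 0/1 label) is tracked per test chunk, and a marked rise above its running baseline is taken as a drift signal. The baseline length and the two-standard-deviation rule are illustrative assumptions, not a prescribed procedure from the cited papers.

import numpy as np
from sklearn.metrics import brier_score_loss

def brier_drift_signal(y_true_chunks, y_prob_chunks, k=2.0, min_history=3):
    # y_true_chunks: list of 0/1 label arrays per chunk (test labels are required).
    # y_prob_chunks: list of predicted defect probabilities per chunk.
    flagged, history = [], []
    for i, (y_true, y_prob) in enumerate(zip(y_true_chunks, y_prob_chunks)):
        score = brier_score_loss(y_true, y_prob)  # distance between prediction and label
        if len(history) >= min_history:
            baseline, spread = np.mean(history), np.std(history)
            # A significant increase over the running baseline is treated as concept drift.
            if score > baseline + k * spread:
                flagged.append(i)
        history.append(score)
    return flagged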
“…Prediction model performance may degrade over time when chronology matters, and this must be resolved by retraining on data consistent with the current distribution. This phenomenon is called Concept Drift (CD) (Gangwar, Kumar et al 2021). Interest in solving problems in dynamic environments that require real-time or near-real-time processing (such as our problem, or learning from logs, recorded events, operations, and sensor readings) is increasing.…”
Section: Introduction (mentioning; confidence: 99%)
“…Changes in the data, including their probability distributions, can lead to inaccurate results, and even a well-trained prediction model will become outdated in the face of such drift, as noted by Dong et al [8]. Furthermore, other researchers [9][10][11][12][13] have reported that the distribution changes across versions in chronological defect datasets. Findings from streaming data analytics verify that if the historical data change over time, the prediction models become outdated [4].…”
Section: Introduction (mentioning; confidence: 99%)
“…Even a well-trained prediction model can become obsolete in the presence of such drift, as highlighted by Dong et al [8]. Furthermore, other researchers [9][10][11][12][13] have reported that the distribution changes across versions in chronological defect datasets. Empirical evidence from the field of streaming data analytics corroborates that, as temporal shifts occur in historical data, prediction models become obsolete [4].…”
Section: Introduction (mentioning; confidence: 99%)