The early and accurate prediction of defects helps in testing software and therefore leads to an overall higher-quality product. Due to drift in software defect data, prediction model performances may degrade over time. Very few earlier works have investigated the significance of concept drift (CD) in software-defect prediction (SDP). Their results have shown that CD is present in software defect data and tha it has a significant impact on the performance of defect prediction. Motivated from this observation, this paper presents a paired learner-based drift detection and adaptation approach in SDP that dynamically adapts the varying concepts by updating one of the learners in pair. For a given defect dataset, a subset of data modules is analyzed at a time by both learners based on their learning experience from the past. A difference in accuracies of the two is used to detect drift in the data. We perform an evaluation of the presented study using defect datasets collected from the SEACraft and PROMISE data repositories. The experimentation results show that the presented approach successfully detects the concept drift points and performs better compared to existing methods, as is evident from the comparative analysis performed using various performance parameters such as number of drift points, ROC-AUC score, accuracy, and statistical analysis using Wilcoxon signed rank test.
Software Defect Prediction (SDP) is crucial towards software quality assurance in software engineering. SDP analyzes the software metrics data for timely prediction of defect prone software modules. Prediction process is automated by constructing defect prediction classification models using machine learning techniques. These models are trained using metrics data from historical projects of similar types. Based on the learned experience, models are used to predict defect prone modules in currently tested software. These models perform well if the concept is stationary in a dynamic software development environment. But their performance degrades unexpectedly in the presence of change in concept (Concept Drift). Therefore, concept drift (CD) detection is an important activity for improving the overall accuracy of the prediction model. Previous studies on SDP have shown that CD may occur in software defect data and the used defect prediction model may require to be updated to deal with CD. This phenomenon of handling the CD is known as CD adaptation. It is observed that still efforts need to be done in this direction in the SDP domain. In this paper, we have proposed a pair of paired learners (PoPL) approach for handling CD in SDP. We combined the drift detection capabilities of two independent paired learners and used the paired learner (PL) with the best performance in recent time for next prediction. We experimented on various publicly available software defect datasets garnered from public data repositories. Experimentation results showed that our proposed approach performed better than the existing similar works and the base PL model based on various performance measures.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.