Effort-aware just-in-time defect prediction: simple unsupervised models could be better than supervised models

Yang, Yibiao; Zhou, Yuming; Liu, Jinping; Zhao, Yangyang; Lü, Hongmin; Xu, Lei; Xu, Baowen; Leung, Hareton

doi:10.1145/2950290.2950353

Cited by 204 publications

(219 citation statements)

References 42 publications

Supporting

Mentioning

212

Contrasting

“…They say a "good" defect predictor selects the 20% of files containing 80% of the defects. In the literature, this 20/80 rule is often called Popt20 (the percent of the bugs found after reading 20%). Popt20 is widely used in the literature and, for details on that measure, we refer the reader to those publications [18], [42], [48], [62], [64], [69], [69], [111]. For this paper, all we need to say about Popt20 is the conclusions reached from this metric are nearly the same as the conclusions reached via G-score.…”

Section: Evaluation Criteria (mentioning)

confidence: 98%
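
The quoted description of Popt20 (the percent of bugs found after reading 20%) can be illustrated with a short sketch. The Python below is an illustration only, not code from any of the cited papers: the function name `recall_at_effort`, its parameters, and the toy data are assumptions, and it computes the simple recall-at-budget reading of the quote rather than any normalized, area-based variant the cited publications may define.

```python
from typing import Sequence


def recall_at_effort(
    scores: Sequence[float],
    efforts: Sequence[float],
    buggy: Sequence[int],
    budget: float = 0.2,
) -> float:
    """Fraction of all bugs found when items are inspected in descending
    score order until `budget` (a fraction) of the total effort is spent.

    Hypothetical helper: `efforts` could be lines of code per file or per
    change; `buggy` holds 1 for a defective item and 0 otherwise.
    """
    total_effort = sum(efforts)
    total_bugs = sum(buggy)
    if total_effort <= 0 or total_bugs == 0:
        raise ValueError("need positive total effort and at least one bug")

    # Inspect the items the predictor considers most suspicious first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    limit = budget * total_effort
    spent = 0.0
    found = 0
    for i in order:
        # Stop as soon as the next item would exceed the effort budget.
        if spent + efforts[i] > limit + 1e-9:
            break
        spent += efforts[i]
        found += buggy[i]
    return found / total_bugs


if __name__ == "__main__":
    # Toy data (assumed): four files with predicted scores, sizes, labels.
    scores = [0.9, 0.8, 0.1, 0.5]
    efforts = [10, 10, 60, 20]
    buggy = [1, 0, 1, 1]
    print(recall_at_effort(scores, efforts, buggy, budget=0.2))
```

With a 20% budget, a predictor meeting the 20/80 rule mentioned in the quote would return a value of about 0.8 from this function.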

Better Data Labelling With EMBLEM (and how that Impacts Defect Prediction)

2022

IEEE Trans. Software Eng.

Standard automatic methods for recognizing problematic development commits can be greatly improved via the incremental application of human+artificial expertise. In this approach, called EMBLEM, an AI tool first explores the software development process to label commits that are most problematic. Humans then apply their expertise to check those labels (perhaps resulting in the AI updating the support vectors within its SVM learner). We recommend this human+AI partnership, for several reasons. When a new domain is encountered, EMBLEM can learn better ways to label which comments refer to real problems. Also, in studies with 9 open source software projects, labelling via EMBLEM's incremental application of human+AI is at least an order of magnitude cheaper than existing methods (≈ eight times). Further, EMBLEM is very effective. For the data sets explored here, EMBLEM's better labelling methods significantly improved Popt20 and G-score performance in nearly all the projects studied here. Table 1: This paper argues against using keywords like these as a method for labelling a commit as "buggy".

“…Kamei's features contain 14 features in total, which have been proposed and validated by Kamei et al and used by Yang et al, and Yang et al in change classification. Kamei's features are divided into diffusion, size, purpose, history, and experience dimensions.…”

Section: Change Classification Methodology (mentioning)

confidence: 99%

“…For effort‐aware just‐in‐time bug prediction, Yang et al found that simple unsupervised models could be better than supervised models. However, Huang et al showed that the method proposed by Yang et al has some disadvantages so they proposed an improved supervised model called CBS, which is better than the state‐of‐the‐art supervised model (ie, EALR). At the same time, compared with the LT method proposed by Yang et al, CBS can also get significant reduces on context switches and false alarms when obtaining similar recall.…”

Section: Related Work (mentioning)

confidence: 99%
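
The statements above do not say how the "simple unsupervised models" rank changes. As a rough sketch, assuming such a model orders changes by a single change metric with smaller values inspected first (an assumption not confirmed by the quoted text), the idea might look like the following Python. The field name `lt`, the function name `rank_by_small_metric`, and the sample values are hypothetical.

```python
from typing import Dict, List, Union

Change = Dict[str, Union[str, float]]


def rank_by_small_metric(changes: List[Change], metric: str = "lt") -> List[Change]:
    """Order changes so that those with the smallest metric value come first.

    No labels and no training step are involved: the ranking uses only the
    chosen metric, which is what makes the approach unsupervised.
    `metric` is an assumed dictionary key, not a real library field.
    """
    return sorted(changes, key=lambda change: change[metric])


if __name__ == "__main__":
    # Made-up changes with an assumed size-like metric named "lt".
    changes = [
        {"id": "a", "lt": 500},
        {"id": "b", "lt": 20},
        {"id": "c", "lt": 130},
    ]
    for change in rank_by_small_metric(changes):
        print(change["id"], change["lt"])
```

A supervised model such as the EALR or CBS approaches named in the quote would instead learn its ranking from labelled historical changes.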

“…For effort-aware just-in-time bug prediction, Yang et al [32] found that simple unsupervised models could be better than supervised models. However, Huang et al [48] showed that the method proposed by Yang et al [32] has some disadvantages so they proposed an improved supervised model called CBS, which is better than the state-of-the-art supervised model (ie, EALR).…”

Section: Bug Prediction In Software Code Changes (mentioning)

confidence: 99%

An empirical study of software change classification with imbalance data‐handling methods

et al. 2018

Summary: Bug prediction in software code changes can help developers find and fix bugs immediately when they are introduced, thus improving the effectiveness and validity of bug fixing. In data mining, this problem can be regarded as a change classification task. However, one of its key characteristics, ie, class‐imbalance, holds back the performance of standard classification methods. In this paper, we consider a number of imbalance data‐handling methods and extract a more comprehensive group of change features, aiming to achieve better change classification performance. Two different types of imbalance data‐handling methods, namely, resampling and ensemble learning methods, are employed. In particular, we explore the performance of their combination. To compare the performance of different imbalance data‐handling methods, an experiment with 10 open source projects is conducted. Four classification methods, including J48, Naïve Bayes, SMO, and Random Forest, are used as standard classifiers and as the base classifiers, respectively. Moreover, the contributions of different groups of change features are evaluated. Experimental results show that imbalance data‐handling methods can improve the performance of change classification and that the combination methods, which take advantage of both ensemble learning and resampling, perform better than using ensemble learning methods or resampling methods individually. Of the studied imbalance data‐handling methods, the combination of Bagging and random undersampling with J48 as the base classifier yields better prediction results than those achieved by other methods. Additionally, of the collected change features, text vector features account for a larger proportion than others.

“…Additionally, researchers have turned their attention to how defect prediction research should be conducted, e.g., reducing the bias through sampling approaches [10], the impact of hyperparameter tuning [11], suitable baseline comparisons [12] or general guidelines that should be considered [13]. While all of the above contribute to the advancement of the defect prediction state of the art, there are also multiple publications that question the progress of the state of the art through replications in recent years, as they demonstrate that older (e.g., [4]) or trivial (e.g., [14], [15]) approaches are comparable to or even better than more complex recent approaches from the state of the art.…”

Section: Introduction (mentioning)

confidence: 99%

On the Costs and Profit of Software Defect Prediction

Herbold

2021

IEEE Trans. Software Eng.

Defect prediction can be a powerful tool to guide the use of quality assurance resources. However, while a lot of research has covered methods for defect prediction as well as methodological aspects of defect prediction research, the actual cost saving potential of defect prediction is still unclear. Within this article, we close this research gap and formulate a cost model for software defect prediction. We derive mathematically provable boundary conditions that must be fulfilled by defect prediction models such that there is a positive profit when the defect prediction model is used. Our cost model includes aspects like the costs for quality assurance, the costs of post-release defects, the possibility that quality assurance fails to reveal predicted defects, and the relationship between software artifacts and defects. We initialize the cost model using different assumptions and perform experiments to show trends in the behavior of costs on real projects. Our results show that the unrealistic assumption that defects only affect a single software artifact, which is a standard practice in the defect prediction literature, leads to inaccurate cost estimations. Moreover, the results indicate that thresholds for machine learning metrics are also not suited to define success criteria for software defect prediction. Index Terms: Defect prediction, costs, return on investment.
