Proceedings of the 4th International Workshop on Predictor Models in Software Engineering 2008
DOI: 10.1145/1370788.1370801
Implications of ceiling effects in defect predictors

Abstract: Context: There are many methods that input static code features and output a predictor for faulty code modules. These data mining methods have hit a "performance ceiling", i.e., some inherent upper bound on the amount of information offered by, say, static code features when identifying modules which contain faults. Objective: We seek an explanation for this ceiling effect. Perhaps static code features have "limited information content", i.e., their information can be quickly and completely discovered by even si…
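The setup the abstract describes can be made concrete with a minimal sketch: a simple learner trained on static code features to predict fault-proneness. The feature names, the synthetic data, the log-transform, and the use of Gaussian Naive Bayes via scikit-learn are illustrative assumptions, not the paper's actual pipeline or datasets.

```python
# Minimal sketch of a static-code-feature defect predictor.
# Features and labels below are synthetic placeholders.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
# Hypothetical static code features: lines of code, cyclomatic complexity, volume.
loc = rng.lognormal(4.0, 1.0, n)
cc = rng.poisson(5, n) + 1
volume = loc * rng.uniform(2, 6, n)
X = np.column_stack([np.log(loc), np.log(cc), np.log(volume)])  # log-transformed metrics

# Synthetic ground truth: larger, more complex modules are more fault-prone.
p_fault = 1 / (1 + np.exp(-(0.8 * np.log(cc) + 0.3 * np.log(loc) - 3.5)))
y = rng.random(n) < p_fault

clf = GaussianNB()
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"mean AUC over 5 folds: {scores.mean():.2f}")
```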

Cited by 154 publications (118 citation statements)
References: 55 publications
“…In other words, the number of defective files is far less than the number of defect-free files. Therefore, we use the under-sampling method, which is the most suitable sampling method for our datasets [16]. The pseudocode of the prediction model is given in Figure 3.…”
Section: Construction of the Prediction Model
confidence: 99%
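The under-sampling step this citation statement describes can be sketched as follows. The function name and array layout are illustrative assumptions, not the cited paper's implementation; the pseudocode in that paper's Figure 3 is not reproduced here.

```python
# A minimal sketch of random under-sampling: balance defective vs.
# defect-free files by discarding majority-class rows.
import numpy as np

def undersample(X, y, seed=0):
    """Randomly drop majority-class samples until classes are balanced."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)   # defective files (rare class)
    majority = np.flatnonzero(y == 0)   # defect-free files
    keep = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, keep])
    rng.shuffle(idx)
    return X[idx], y[idx]
```

A design note: under-sampling trades information for balance by discarding majority-class rows; alternatives such as over-sampling keep all data at the cost of duplicated minority examples.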
“…As a result, both increasing the efficiency of the software testing phase and delivering the software product to the market on time become possible. Reported results in the software defect prediction literature suggest that further progress in defect prediction performance can be achieved by increasing the content of the input data that defect predictors learn from, rather than by using different algorithms or increasing the size of the input data [17], [15], [16]. We can group some significant work in the literature in terms of its focus: algorithm-driven approaches, data-size-driven approaches, and data-content-driven approaches.…”
confidence: 99%
“…The question is whether there remains additional information in the data set that might be exploited to improve performance. While this is still an open question, and one to which we make some contribution through the present study, there is a view that a performance ceiling has been reached, and that the way forward lies in enriching the data with new information beyond existing metrics [12]. Nevertheless, the NASA data sets are freely available and remain attractive targets for researchers.…”
Section: Introduction
confidence: 98%
“…Although unsupervised learning has been applied, such methods also show unstable performance [8], [9]. Unlike unsupervised learning, active learning reduces the number of labeled instances required to achieve stable performance in the majority of reported results [10].”
Section: Introduction
confidence: 99%
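Pool-based uncertainty sampling is one common form of the active learning this citation statement refers to. The sketch below is a generic illustration under assumed names, seed sizes, and query budget; it is not the approach of reference [10].

```python
# A minimal sketch of pool-based active learning with uncertainty sampling:
# iteratively query labels for the instances the model is least sure about.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learn(X_pool, y_oracle, seed_size=10, budget=50, rng_seed=0):
    rng = np.random.default_rng(rng_seed)
    # Seed with a few labeled examples from each class, as in standard
    # pool-based setups, so the first fit sees both classes.
    pos = np.flatnonzero(y_oracle == 1)
    neg = np.flatnonzero(y_oracle == 0)
    labeled = list(rng.choice(pos, size=seed_size // 2, replace=False)) + \
              list(rng.choice(neg, size=seed_size - seed_size // 2, replace=False))
    clf = LogisticRegression(max_iter=1000)
    for _ in range(budget):
        clf.fit(X_pool[labeled], y_oracle[labeled])
        proba = clf.predict_proba(X_pool)[:, 1]
        uncertainty = np.abs(proba - 0.5)   # closest to 0.5 = least certain
        uncertainty[labeled] = np.inf       # never re-query labeled items
        labeled.append(int(np.argmin(uncertainty)))
    return clf, labeled
```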