Kehan Gao scite author profile

The selection of software metrics for building software quality prediction models is a search-based software engineering problem. An exhaustive search for such metrics is usually not feasible due to limited project resources, especially if the number of available metrics is large. Defect prediction models help project managers make better use of valuable project resources for software quality improvement. A fault-proneness prediction model is only as effective and useful as the quality of its software measurement data allows. This study focuses on the problem of attribute selection in the context of software quality estimation. A comparative investigation is presented for evaluating our proposed hybrid attribute selection approach, in which feature ranking is first used to reduce the search space, followed by feature subset selection. Seven feature ranking techniques and four feature subset selection approaches are evaluated. The models are trained using five commonly used classification algorithms. The case study is based on software metrics and defect data collected from multiple releases of a large real-world software system. The results demonstrate that while some feature ranking techniques performed similarly, the automatic hybrid search algorithm performed the best among the feature subset selection methods. Moreover, the performance of the defect prediction models either improved or remained unchanged when over 85% of the software metrics were eliminated.
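The two-stage idea in the abstract (rank first to shrink the search space, then search for a subset among the survivors) can be illustrated with a minimal sketch. The abstract does not name the seven rankers, the four subset searches, or the five classifiers, so everything concrete here is an assumption made for illustration: absolute Pearson correlation stands in for the ranker, greedy forward selection for the subset search, and a leave-one-out nearest-centroid classifier for the evaluator. The function names and the synthetic data are hypothetical and are not taken from the study.

```python
import random
from statistics import mean, pstdev


def abs_correlation(column, labels):
    """Absolute Pearson correlation between one metric and the 0/1 defect label."""
    sx, sy = pstdev(column), pstdev(labels)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no ranking information
    mx, my = mean(column), mean(labels)
    cov = mean((x - mx) * (y - my) for x, y in zip(column, labels))
    return abs(cov / (sx * sy))


def rank_features(X, y, keep):
    """Stage 1 (assumed ranker): score each metric alone, keep the top `keep` indices."""
    n_features = len(X[0])
    scores = [abs_correlation([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:keep]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier using only `features`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if rows:
                centroids[label] = [mean(row[j] for row in rows) for j in features]
        point = [X[i][j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def forward_select(X, y, candidates):
    """Stage 2 (assumed search): greedy forward selection over the ranked candidates."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for j in candidates:
            if j in selected:
                continue
            score = loo_accuracy(X, y, selected + [j])
            if score > best:  # strict improvement only, so ties keep the smaller subset
                best, choice, improved = score, j, True
        if improved:
            selected.append(choice)
    return selected, best


# Synthetic demo data: metric 0 tracks the defect label, metrics 1-5 are pure noise.
random.seed(0)
labels = [i % 2 for i in range(40)]
data = [[10 * label + random.random()] + [random.random() for _ in range(5)]
        for label in labels]

ranked = rank_features(data, labels, keep=3)               # stage 1: 6 metrics -> 3
selected, accuracy = forward_select(data, labels, ranked)  # stage 2: search within the 3
print(ranked, selected, accuracy)
```

The point of the design is cost: the subset search evaluates a classifier for every candidate it considers, so restricting it to the ranked survivors keeps the number of evaluations small even when the full metric set is large.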


Attribute Selection and Imbalanced Data: Problems in Software Defect Prediction

Khoshgoftaar

Gao

Seliya

2010

118


A comparative study of iterative and non-iterative feature selection techniques for software defect prediction

et al. 2013


An application of zero-inflated Poisson regression for software fault prediction

Khoshgoftaar

Gao

Szabo


A Comprehensive Empirical Study of Count Models for Software Fault Prediction

Gao

Khoshgoftaar

2007

IEEE Trans. Rel.




Kehan Gao

Choosing software metrics for defect prediction: an investigation on feature selection techniques
