Software defect prediction has been an important research topic in software engineering, particularly as a way to address the inefficiency and ineffectiveness of existing industrial approaches to software testing and review. Prediction performance decreases significantly when the data set contains noisy attributes and class imbalance. Feature selection is generally used in machine learning when the learning task involves high-dimensional data sets with noisy attributes. Most feature selection algorithms use local search throughout the entire process; consequently, near-optimal to optimal solutions are quite difficult to achieve. Metaheuristic optimization searches the full solution space with a global search ability, which significantly increases the chance of finding high-quality solutions within a reasonable period of time. In this research, we propose combining metaheuristic optimization methods with the bagging technique to improve the performance of software defect prediction. Metaheuristic optimization methods (genetic algorithm and particle swarm optimization) are applied to feature selection, and the bagging technique is employed to deal with the class imbalance problem. The results indicate that the proposed methods yield an impressive improvement in prediction performance for most classifiers. Based on the comparison results, we conclude that there is no significant difference between particle swarm optimization and genetic algorithm when used for feature selection with most classifiers in software defect prediction.
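The feature selection step described above can be illustrated with a minimal genetic algorithm over binary feature masks. This is only a sketch under stated assumptions, not the implementation evaluated in this research: the function name `ga_select`, its parameters, and the `toy_fitness` function are hypothetical. In practice the fitness would be the validated performance of a classifier trained on the selected attributes; here a toy score stands in for it.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (hypothetical helper)."""
    rng = random.Random(seed)
    # Random masks plus the full feature set, so the search never ends
    # below the "use every attribute" baseline.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size - 1)]
    population.append([1] * n_features)

    def tournament(scored):
        # Binary tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for _ in range(generations):
        scored = [(fitness(mask), mask) for mask in population]
        elite = max(scored, key=lambda pair: pair[0])[1]
        children = [elite[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


# Toy stand-in for classifier performance (an assumption for illustration):
# attributes 0, 3 and 5 are "informative"; every selected attribute costs 0.2.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    gain = sum(mask[i] for i in INFORMATIVE)
    return gain - 0.2 * sum(mask)


if __name__ == "__main__":
    best = ga_select(8, toy_fitness)
    print(best, toy_fitness(best))
```

Because the initial population contains the all-ones mask and the elite is carried over unmutated, the returned mask never scores below the full attribute set. A particle swarm variant would replace the crossover and mutation steps with velocity updates over the same binary masks.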
Finding and correcting software defects has been the most expensive activity in software development. Accurate prediction of defect-prone software modules can help direct the software testing effort, reduce costs, and improve the software testing process by focusing on fault-prone modules. Recently, static code attributes have been used as defect predictors in software defect prediction research, since they are useful, generalizable, easy to use, and widely used. However, two common aspects of data quality can affect the performance of software defect prediction: class imbalance and noisy attributes. In this research, we propose combining particle swarm optimization with the bagging technique to improve the accuracy of software defect prediction. Particle swarm optimization is applied to feature selection, and the bagging technique is employed to deal with the class imbalance problem. The proposed method is evaluated using data sets from the NASA metric data repository. The results indicate that the proposed method yields an impressive improvement in prediction performance for most classifiers.
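The bagging half of the method can be sketched as plain bootstrap aggregation with majority voting. The base learner, the number of bootstrap samples, and the toy data below are assumptions made for illustration only; the abstract does not specify them, and the one-feature threshold stump is a stand-in for whichever classifier is being bagged.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-feature threshold rule (stand-in base learner, an assumption)."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for hi_label in (0, 1):
                preds = [hi_label if r[f] > t else 1 - hi_label for r in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, hi_label)
    _, f, t, hi_label = best
    return lambda row: hi_label if row[f] > t else 1 - hi_label


def bagging_fit(rows, labels, n_estimators=15, seed=0):
    """Train one base learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Majority vote over the ensemble."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy, imbalanced data: two static code attributes per module,
    # label 1 = defective (2 of 9 modules). Values are made up.
    rows = [[10, 3], [12, 1], [15, 2], [20, 5], [22, 4],
            [25, 1], [30, 2], [80, 3], [95, 4]]
    labels = [0, 0, 0, 0, 0, 0, 0, 1, 1]
    models = bagging_fit(rows, labels)
    print(bagging_predict(models, [90, 2]), bagging_predict(models, [5, 2]))
```

Each bootstrap sample presents a different class mix to its base learner, and the vote averages out the variance between them. In a combined pipeline, the rows passed to `bagging_fit` would first be reduced to the attributes chosen by the feature selection step.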