Cross-project defect prediction (CPDP) methods can be used when the target project is new or lacks enough labeled program modules. In such target projects, program modules can easily be extracted and measured with software measurement tools, but labeling them is time-consuming, error-prone, and requires professional domain knowledge. Moreover, directly using labeled modules from other projects (i.e., the source projects) usually cannot achieve satisfactory performance because of the large data distribution difference between projects. In this article, to the best of our knowledge, we are the first to propose a novel method, ALTRA, which utilizes both active learning and TrAdaBoost to alleviate this issue. In particular, we first use the Burak filter to select labeled modules from the source project that are similar to the unlabeled modules in the target project. Then we use active learning to choose representative unlabeled modules from the target project and ask experts to label them (i.e., as defective or non-defective). Next, we use TrAdaBoost to determine the weights of the labeled modules from the source and target projects, and construct the model via a weighted support vector machine. After a small number of modules (i.e., only 5% of the modules) in the target project have been labeled, ALTRA terminates and returns the final constructed model. To show the effectiveness of ALTRA, we choose 10 large-scale open-source projects from different application domains. In terms of both the F1 and AUC performance indicators, ALTRA performs significantly better than seven state-of-the-art CPDP baselines. Moreover, we show that the use of the Burak filter, the uncertainty-based active learning strategy, the class-imbalance learning method, and TrAdaBoost all contribute to the competitiveness of ALTRA.
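The workflow described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `oracle` function standing in for the expert labeler, the 5% labeling budget split across five rounds, the choice of k = 10 neighbours for the Burak filter, and the simplified one-step down-weighting used in place of the full TrAdaBoost update are all assumptions made for the example.

```python
# Illustrative sketch of an ALTRA-style workflow (Burak filter + uncertainty-based
# active learning + weighted SVM). The TrAdaBoost weight update is simplified here.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def burak_filter(X_src, y_src, X_tgt, k=10):
    """Keep source modules that are among the k nearest neighbours of any target module."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_src)
    idx = np.unique(nn.kneighbors(X_tgt, return_distance=False))
    return X_src[idx], y_src[idx]

def altra_sketch(X_src, y_src, X_tgt, oracle, budget=0.05, rounds=5):
    """oracle(i) is a hypothetical expert-labeling call returning 0/1 for target module i."""
    X_s, y_s = burak_filter(X_src, y_src, X_tgt)       # similar source modules only
    labeled_idx, labels = [], []
    pool = np.arange(len(X_tgt))                        # still-unlabeled target modules
    per_round = max(1, int(budget * len(X_tgt)) // rounds)
    w_s = np.ones(len(X_s))                             # source-instance weights
    clf = None
    for _ in range(rounds):
        # Train a weighted SVM on source modules plus the target modules labeled so far.
        X_l = np.vstack([X_s] + ([X_tgt[labeled_idx]] if labeled_idx else []))
        y_l = np.concatenate([y_s, labels]) if labels else y_s
        w = np.concatenate([w_s, np.ones(len(labels))]) if labels else w_s
        clf = SVC(kernel="linear", class_weight="balanced").fit(X_l, y_l, sample_weight=w)
        # Uncertainty sampling: query the target modules closest to the decision boundary.
        margins = np.abs(clf.decision_function(X_tgt[pool]))
        query = pool[np.argsort(margins)[:per_round]]
        for i in query:
            labeled_idx.append(i)
            labels.append(oracle(i))
        pool = np.setdiff1d(pool, query)
        # Simplified stand-in for the TrAdaBoost update: down-weight misclassified source modules.
        err = clf.predict(X_s) != y_s
        w_s[err] *= 0.5
    return clf
```

Querying the modules with the smallest decision-function margin mirrors the uncertainty-based active learning strategy mentioned in the abstract, while the class-balanced SVM stands in for the class-imbalance handling.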
Cross-project defect prediction (CPDP) is needed when the target projects are new or have too little training data, since such projects do not have sufficient historical data to build high-quality prediction models. Researchers have proposed many CPDP methods, and previous studies have extensively compared their performance. However, to the best of our knowledge, it remains unclear whether different CPDP methods identify the same defective modules, and this issue has not been thoroughly explored. In this article, we select 12 state-of-the-art CPDP methods, including eight supervised methods and four unsupervised methods. We first compare the performance of these methods under the same experimental settings on five widely used datasets (i.e., NASA, SOFTLAB, PROMISE, AEEEM, and ReLink) and rank them via the Scott-Knott test. The results confirm the competitiveness of the unsupervised methods. Then we perform a diversity analysis on the defective modules identified by these methods using the McNemar test. The empirical results verify that different CPDP methods may lead to differences in the modules predicted as defective, especially when supervised methods are compared with unsupervised methods. Finally, we find that a certain number of defective modules cannot be correctly identified by any of the CPDP methods, or can be correctly identified by only one CPDP method. These findings can be used to design more effective methods that further improve the performance of CPDP.
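The diversity analysis mentioned in this abstract relies on the McNemar test, which checks whether two classifiers disagree systematically on which modules they classify correctly. A minimal sketch is shown below, assuming the true labels and the two methods' predictions are available as 0/1 arrays; the arrays here are toy placeholders rather than data from the studied datasets, and the chi-square variant with continuity correction is used (an exact binomial version is an alternative).

```python
# Hedged illustration of a McNemar-style diversity check between two CPDP methods.
import numpy as np
from scipy.stats import chi2

def mcnemar_pvalue(y_true, pred_a, pred_b):
    """McNemar test (chi-square with continuity correction) on disagreement in correctness."""
    correct_a = pred_a == y_true
    correct_b = pred_b == y_true
    b = np.sum(correct_a & ~correct_b)   # method A right, method B wrong
    c = np.sum(~correct_a & correct_b)   # method A wrong, method B right
    if b + c == 0:
        return 1.0                        # the two methods never disagree
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return chi2.sf(stat, df=1)

# Toy example with placeholder predictions; a small p-value indicates that the two
# methods tend to flag different modules as defective.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
pred_a = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
pred_b = np.array([0, 0, 0, 1, 1, 1, 0, 0, 1, 0])
print(mcnemar_pvalue(y_true, pred_a, pred_b))
```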