Software defect prediction (SDP) is an active research field in software engineering that aims to identify defect-prone modules. With SDP, limited testing resources can be allocated effectively to those modules. SDP requires sufficient local data within a company, but local data are not always available, e.g., for pilot projects. Companies without local data can employ cross-project defect prediction (CPDP), which uses external data to build classifiers. The major challenge of CPDP is that training and test data follow different distributions. To tackle this, instances of the source data that are similar to the target data are selected to build classifiers. Software datasets also have a class imbalance problem: the ratio of the defective class to the clean class is very low, which usually lowers the performance of classifiers. We propose a Hybrid Instance Selection Using Nearest-Neighbor (HISNN) method that performs hybrid classification, selectively learning local knowledge (via k-nearest neighbor) and global knowledge (via naïve Bayes). Instances with strong local knowledge are identified via nearest neighbors that share the same class label. Previous studies showed low PD (probability of detection) or high PF (probability of false alarm), which makes them impractical to use. The experimental results show that HISNN produces high overall performance as well as high PD and low PF.
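The idea of choosing between local and global knowledge can be illustrated with a minimal sketch. It is not the HISNN algorithm as published: the abstract does not give the distance measure, the value of k, or the instance selection steps, so the Euclidean distance, `k=3`, the unanimity rule, and the function names `fit_gaussian_nb`, `predict_nb`, and `hybrid_predict` are all assumptions made for illustration. A test instance whose k nearest training neighbors all share one label is classified locally; otherwise a Gaussian naïve Bayes model decides.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Fit a Gaussian naive Bayes model: {label: (prior, means, variances)}."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_nb(model, x):
    """Return the label with the highest log posterior for instance x."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def hybrid_predict(X_train, y_train, x, k=3):
    """Classify x locally if its k nearest neighbors agree, else globally.

    Assumption for illustration: unanimous neighbor labels stand in for
    "strong local knowledge"; Euclidean distance is used.
    """
    nearest = sorted(
        range(len(X_train)), key=lambda i: math.dist(X_train[i], x)
    )[:k]
    labels = {y_train[i] for i in nearest}
    if len(labels) == 1:
        return labels.pop(), "local"
    return predict_nb(fit_gaussian_nb(X_train, y_train), x), "global"
```

Returning the decision path alongside the label makes it easy to check how often the local rule fires on a given dataset, which is the quantity one would tune k against in this sketch.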
Fault localization techniques are used to deduce the exact source of a failure from a set of failure indications while debugging software, and they play a crucial role in improving software quality. Mutation‐based fault localization (MBFL) techniques have been proposed to localize faults at a finer granularity and with higher accuracy than traditional fault localization techniques. Despite their effectiveness, the immense cost of mutation analysis hinders the practical application of MBFL in industry. Various mutation alternative strategies are used to lower the cost of MBFL, but they sacrifice the accuracy of the localization results. Higher‐order mutation testing was proposed to search for valuable mutants that drive testing harder and reduce the overall test effort. However, to the best of our knowledge, higher‐order mutants (HOMs) have never been used to address the cost problem of MBFL. This paper proposes a novel, cost‐effective MBFL technique called HOTFUZ (Higher‐Order muTation‐based FaUlt localiZation) that employs HOMs to reduce the cost while minimizing the accuracy degradation. HOTFUZ combines mutants of a program under test into HOMs to decrease the number of mutants by more than half, depending on the order of the HOMs. An experimental study using 65 real‐world faults from CoREBench assesses the cost‐effectiveness of the proposed approach. The experimental results show that HOTFUZ outperforms the existing mutation alternative strategies, localizing faults more accurately when the same number of mutants is executed. HOTFUZ has three main benefits over existing mutant reduction techniques for MBFL: (a) it keeps the advantage of using the whole set of mutation operators; (b) it does not discard generated mutants randomly for the sake of efficiency; and (c) it significantly decreases the proportion of equivalent mutants.
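The effect of the HOM order on the number of mutants to execute can be shown with a small sketch. The abstract does not say how HOTFUZ selects which first-order mutants to combine, so the sequential chunking below, the `(line, operator)` mutant representation, and the function name `combine_into_homs` are hypothetical and serve only to show the arithmetic: grouping n first-order mutants into HOMs of a given order leaves roughly n divided by that order mutants to run.

```python
from typing import List, Tuple

# Hypothetical representation: (line number, mutation operator name).
Mutant = Tuple[int, str]


def combine_into_homs(
    mutants: List[Mutant], order: int = 2
) -> List[Tuple[Mutant, ...]]:
    """Group first-order mutants into higher-order mutants of size `order`.

    Assumption for illustration: mutants are combined in list order. The
    last group may be smaller when len(mutants) is not a multiple of order.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    return [
        tuple(mutants[i:i + order]) for i in range(0, len(mutants), order)
    ]


first_order = [
    (10, "AOR"), (12, "ROR"), (15, "LCR"),
    (21, "AOR"), (30, "ROR"), (42, "UOI"),
]
second_order = combine_into_homs(first_order, order=2)
third_order = combine_into_homs(first_order, order=3)
```

In this sketch every first-order mutant ends up in exactly one HOM, so none is discarded and every mutation operator in the input remains represented, which mirrors benefits (a) and (b) listed in the abstract.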