Background.
Recent years have seen an increasing interest in cross-project defect prediction (CPDP), which aims to apply defect prediction models built on source projects to a target project. Currently, a variety of (complex) CPDP models have been proposed with a promising prediction performance.
Problem.
Most, if not all, of the existing CPDP models are not compared against those simple module size models that are easy to implement and have shown a good performance in defect prediction in the literature.
Objective.
We aim to investigate how far we have really progressed in the journey by comparing the performance in defect prediction between the existing CPDP models and simple module size models.
Method.
We first use module size in the target project to build two simple defect prediction models, ManualDown and ManualUp, which do not require any training data from source projects. ManualDown considers a larger module as more defect-prone, while ManualUp considers a smaller module as more defect-prone. Then, we take the following measures to ensure a fair comparison on the performance in defect prediction between the existing CPDP models and the simple module size models: using the same publicly available data sets, using the same performance indicators, and using the prediction performance reported in the original cross-project defect prediction studies.
Result.
The simple module size models have a prediction performance comparable or even superior to most of the existing CPDP models in the literature, including many newly proposed models.
Conclusion.
The results caution us that, if the prediction performance is the goal, the real progress in CPDP is not being achieved as it might have been envisaged. We hence recommend that future studies should include ManualDown/ManualUp as the baseline models for comparison when developing new CPDP models to predict defects in a complete target project.
We demonstrate experimentally and numerically a method using the incoherent delayed self-interference (DSI) of chaotic light from a semiconductor laser with optical feedback to generate wideband chaotic signal. The results show that, the DSI can eliminate the domination of laser relaxation oscillation existing in the chaotic laser light and therefore flatten and widen the power spectrum. Furthermore, the DSI depresses the time-delay signature induced by external cavity modes and improves the symmetry of probability distribution by more than one magnitude. We also experimentally show that this DSI signal is beneficial to the random number generation.
Just-In-Time Software Defect Prediction (O-JIT-SDP) uses an online model to predict whether a new software change will introduce a bug or not. However, existing studies neglect the interaction of Software Quality Assurance (SQA) staff with the model, which may miss the opportunity to improve the prediction accuracy through the feedback from SQA staff. To tackle this problem, we propose Human-In-The-Loop (HITL) O-JIT-SDP that integrates feedback from SQA staff to enhance the prediction process. Furthermore, we introduce a performance evaluation framework that utilizes a k-fold distributed bootstrap method along with the Wilcoxon signed-rank test. This framework facilitates thorough pairwise comparisons of alternative classification algorithms using a prequential evaluation approach. Our proposal enables continuous statistical testing throughout the prequential process, empowering developers to make real-time decisions based on robust statistical evidence. Through experimentation across 10 GitHub projects, we demonstrate that our evaluation framework enhances the credibility of model evaluation, and the incorporation of HITL feedback elevates the prediction performance of online JIT-SDP models. These advancements hold the potential to significantly enhance the value of O-JIT-SDP for industrial applications.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.