Identifying the location of faults in real-world programs is one of the most costly steps in software debugging. To reduce debugging effort, various fault localization techniques have been proposed in recent years. Spectrum-based fault localization (SBFL) is one of the most widely investigated. Most SBFL techniques first calculate the suspiciousness of program elements (such as statements or methods) using the coverage information and execution results of tests. A ranked list of program elements is then generated according to their suspiciousness. However, some SBFL techniques consider only binary coverage information (i.e., whether a program element is covered) and ignore other aspects of test behavior, such as execution frequency when faults occur in iteration entities or loop bodies, which, according to the propagation-infection-execution model, are more likely to be faulty. Existing execution-frequency-based techniques merely replace the feature terms of existing formulas, which limits their effectiveness in fault localization. In this article, we propose a fault localization technique, class reduction and method call frequency (CRMF), which utilizes mutation analysis and information retrieval techniques. In particular, CRMF first uses mutation analysis to identify and filter out classes whose program elements have a low probability of being faulty. We then propose a new suspiciousness formula that applies information retrieval and considers method call frequency. To evaluate the effectiveness of CRMF, we conduct empirical studies on 264 real-world programs from the Defects4J benchmark. The results show that CRMF outperforms the statement-frequency-based technique FLSF and SBFL techniques (i.e., Ochiai, OP2, Tarantula, and Dstar) on both single-fault and multiple-fault programs. Specifically, CRMF ranks 29, 74, and 112 faults within the top 1, 3, and 5 positions, respectively, and achieves a higher mean reciprocal rank for both single-fault and multiple-fault programs. Finally, we discuss the essence of CRMF and analyze its effectiveness on multi-fault programs in detail.
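To make the binary-coverage baseline concrete, the following is a minimal sketch of how two of the SBFL formulas named above, Ochiai and Tarantula, are commonly computed from a coverage matrix and test outcomes. It is not the CRMF formula, which the abstract does not spell out. The function names, the data layout, and the toy coverage matrix are assumptions made for illustration only.

```python
import math

def spectrum_counts(coverage, passed):
    """Return (ef, ep) per element: how many failing / passing tests cover it.

    coverage[t][e] is truthy if test t executed element e (binary coverage);
    passed[t] is True if test t passed. Both layouts are illustrative assumptions.
    """
    counts = []
    for e in range(len(coverage[0])):
        ef = sum(1 for t, row in enumerate(coverage) if row[e] and not passed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[e] and passed[t])
        counts.append((ef, ep))
    return counts

def ochiai(ef, ep, total_failed):
    # Ochiai: ef / sqrt(total_failed * (ef + ep)); defined as 0 when the denominator is 0.
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom > 0 else 0.0

def tarantula(ef, ep, total_failed, total_passed):
    # Tarantula: (ef / F) / (ef / F + ep / P); defined as 0 when the element is never covered.
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total > 0 else 0.0

def rank_elements(scores):
    # Element indices ordered from most to least suspicious (stable for ties).
    return sorted(range(len(scores)), key=lambda e: -scores[e])

# Toy spectrum (hypothetical): 4 tests x 3 program elements.
coverage = [
    [1, 1, 0],  # test 0 (passes)
    [1, 0, 0],  # test 1 (passes)
    [1, 0, 1],  # test 2 (fails)
    [0, 1, 1],  # test 3 (fails)
]
passed = [True, True, False, False]

total_failed = passed.count(False)
total_passed = passed.count(True)
counts = spectrum_counts(coverage, passed)
ochiai_scores = [ochiai(ef, ep, total_failed) for ef, ep in counts]
tarantula_scores = [tarantula(ef, ep, total_failed, total_passed) for ef, ep in counts]
print(rank_elements(ochiai_scores))  # element 2 is covered only by failing tests, so it ranks first
```

Both formulas see only whether an element was covered, not how often it ran, which is the limitation the abstract points to when it motivates frequency-aware scoring.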
Compilation errors are unavoidable when novice students debug their programs. Compiler error messages can help novices localize and remove errors, but students find these messages difficult to understand. Previous studies have categorized compilation errors by analyzing compiler error messages, but the resulting categorizations cannot cover all kinds of errors, which limits the evaluation of compilation error studies. A comprehensive categorization of compilation errors is therefore needed to evaluate the performance of models and tools related to compilation errors. In this study, we first propose a new compilation error categorization based on the smallest unit of a program, the token. Experiments on 29,573 programs from three datasets show that our proposed categorization covers more types of errors and that the distribution of error categories differs significantly between the datasets. Then, based on our proposed categorization, we develop a neural network model, CLACER (CLAssification of Compilation ERrors), for predicting compilation errors. The results indicate that CLACER can improve on the compiler's error localization accuracy and predict compilation errors effectively. Moreover, based on the proposed categorization, we conduct empirical studies to evaluate the performance of three repair tools (i.e., DeepFix, RLAssist, and MACER). The comparison shows that DeepFix and RLAssist fix more errors in the delimiter category than in other categories. Furthermore, MACER performs better than DeepFix and RLAssist because it has a sufficient set of repair patterns for the errors. We also provide suggestions for improving repair tools in the future.