Mutation testing realises the idea of using artificial defects to support testing activities. Mutation is typically used as a way to evaluate the adequacy of test suites, to guide the generation of test cases and to support experimentation.Mutation has reached a maturity phase and gradually gains popularity both in academia and in industry. This chapter presents a survey of recent advances, over the past decade, related to the fundamental problems of mutation testing and sets out the challenges and open problems for the future development of the method. It also collects advices on best practices related to the use of mutation in empirical studies of software testing. Thus, giving the reader a 'mini-handbook'-style roadmap for the application of mutation testing as experimental methodology.
Mutation testing realises the idea of fault-based testing, i.e., using artificial defects to guide the testing process. It is used to evaluate the adequacy of test suites and to guide test case generation. It is a potentially powerful form of testing, but it is well-known that its effectiveness is inhibited by the presence of equivalent mutants. We recently studied Trivial Compiler Equivalence (TCE) as a simple, fast and readily applicable technique for identifying equivalent mutants for C programs. In the present work, we augment our findings with further results for the Java programming language. TCE can remove a large portion of all mutants because they are determined to be either equivalent or duplicates of other mutants. In particular, TCE equivalent mutants account for 7.4% and 5.7% of all C and Java mutants, while duplicated mutants account for a further 21% of all C mutants and 5.4% Java mutants, on average. With respect to a benchmark ground truth suite (of known equivalent mutants), approximately 30% (for C) and 54% (for Java) are TCE equivalent. It is unsurprising that results differ between languages, since mutation characteristics are language-dependent. In the case of Java, our new results suggest that TCE may be particularly effective, finding almost half of all equivalent mutants.
Abstract-Mutation testing is used extensively to support the experimentation of software engineering studies. Its application to real-world projects is possible thanks to modern tools that automate the whole mutation analysis process. However, popular mutation testing tools use a restrictive set of mutants which do not conform to the community standards as supported by the mutation testing literature. This can be problematic since the effectiveness of mutation depends on its mutants. We therefore examine how effective are the mutants of a popular mutation testing tool, named PIT, compared to comprehensive ones, as drawn from the literature and personal experience. We show that comprehensive mutants are harder to kill and encode faults not captured by the mutants of PIT for a range of 11% to 62% of the Java classes of the considered projects.
Mutation analysis is a well-studied, fault-based testing technique. It requires testers to design tests based on a set of artificial defects. The defects help in performing testing activities by measuring the ratio that is revealed by the candidate tests. Unfortunately, applying mutation to real-world programs requires automated tools due to the vast number of defects involved. In such a case, the effectiveness of the method strongly depends on the peculiarities of the employed tools. Thus, when using automated tools, their implementation inadequacies can lead to inaccurate results. To deal with this issue, we cross-evaluate four mutation testing tools for Java, namely PIT, MUJAVA, Major and the research version of PIT, PIT RV , with respect to their fault-detection capabilities. We investigate the strengths of the tools based on: a) a set of real faults and b) manual analysis of the mutants they introduce. We find that there are large differences between the tools' effectiveness and demonstrate
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.