This paper proposes a mutation testing approach for big data processing programs that follow a data flow model, such as those implemented on top of Apache Spark. Mutation testing is a fault-based technique that relies on fault simulation by modifying programs, to create faulty versions called mutants. Mutant creation is carried on by operators able to simulate specific and well identified faults. A testing process must be able to signal faults within mutants and thereby avoid having ill behaviours within a program. We propose a set of mutation operators designed for Spark programs characterized by a data flow and data processing operations. These operators model changes in the data flow and operations, to simulate faults that take into account Spark program characteristics. We performed manual experiments to evaluate the proposed mutation operators in terms of cost and effectiveness. Thereby, we show that mutation operators can contribute to the testing process, in the construction of reliable Spark programs.
Formal specifications can provide significant support for software component reuse, as they allow tools to "understand" the semantics of the components they are manipulating. For instance, they can be of great help on the generation of reusable components through the parameterization of more specific ones, supporting the process of creation and maintenance of libraries of reusable components. In this work 1 , we concentrate on the generalization of algebraic specification components by their parameterization. Knowing that highly specific components have small chances of being reused, but that, on the other hand, if a component is too general, its reuse will often be useless; we try to preserve some set of semantic properties of a component that are considered "important" somehow. So, we propose means to identify the requirements that a formal parameter should satisfy in order to preserve part of the original component semantics in the generalization. To reach this goal, we may (or may not) consider proofs for these properties in the original context and identify the conditions under which these proofs are reproducible after generalization. In our PhD Thesis, we considered both cases; here, we concentrate in the case of known proofs. When these known proofs are rewrite proofs, a set of equations can be extracted from them and added to the formal parameter so that they are preserved in the process. This simple technique provides sufficient conditions for the validity of the considered properties in the models of the more general specification, with the advantage of being easily computed by a simple algorithm that we propose. This algorithm is to be applied in conjunction with a generalization operator that safely effectivates the generalization transformations in the component. This combination provides the means to obtain a more general specification component from which the original one is a specialization and that still satisfies a given set of equational properties with their rewrite proofs. We have also shown that more complex proofs can benefit from this result, although only partially. One of the next steps in this work is to improve the treatement of these other kinds of proofs.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations –citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.