Summary
Context
MapReduce is a processing model used in Big Data to facilitate the analysis of large volumes of data on a distributed architecture.
Objective
The aim of this study is to identify and categorize the state of the art of software testing in MapReduce applications, determining trends and gaps.
Method
A systematic mapping study was conducted to discuss and classify, according to international standards, 54 relevant studies with respect to reasons for testing, types of testing, quality characteristics, test activities, tools, roles, processes, test levels, and research validations.
Results
The principal reasons for testing MapReduce applications are performance issues, potential failures, data-related issues, and the need to satisfy agreements while using resources efficiently. Research efforts focus on performance and, to a lesser degree, on functionality. Performance testing is carried out through simulation and evaluation, whereas functional testing considers program characteristics such as specification and structure. Regardless of the type of testing, most efforts target the unit and integration test levels of the specific MapReduce functions, without considering other parts of the technology stack.
Conclusions
Researchers have both opportunities and challenges in performance and functional testing, and there is room to improve their research through the use of mature and standard validation methods.