Prototyping and testing distributed systems is considered to be a hard task because it is not always possible to reproduce a given sequence of events. While simulations may help on this task, they cannot replace test and validation with real systems. In this paper we present Docker-Hadoop, a container-based virtualization platform designed to prototype, test and deploy MapReduce applications and systems. This tool allowed us to test and reproduce fault-tolerance scenarios that are especially interesting in the context of the PER-MARE project, which aims at adapting the Hadoop framework to the case pervasive systems. Indeed, we developed a fault-tolerant component that can circumvent the limitations from original Hadoop and prevent the job scheduling stall in the case of failures or network disconnections. Thanks to Docker-Hadoop, we could easily prototype and test our improved Hadoop, with the first scalability and speedup results being presented in this paper.
This study presents the advances on designing and implementing scalable techniques to support the development and execution of MapReduce application in pervasive distributed computing infrastructures, in the context of the PER-MARE project. A pervasive framework for MapReduce applications is very useful in practice, especially in those scientific, enterprises and educational centers which have many unused or underused computing resources, which can be fully exploited to solve relevant problems that demand large computing power, such as scientific computing applications, big data processing, etc. In this study, we propose the study of multiple techniques to support volatility and heterogeneity on MapReduce, by applying two complementary approaches: Improving the Apache Hadoop middleware by including context-awareness and fault-tolerance features; and providing an alternative pervasive grid implementation, fully adapted to dynamic environments. The main design and implementation decisions for both alternatives are described and validated through experiments, demonstrating that our approaches provide high reliability when executing on pervasive environments. The analysis of the experiments also leads to several insights on the requirements and constraints from dynamic and volatile systems, reinforcing the importance of context-aware information and advanced fault-tolerance features to provide efficient and reliable MapReduce services on pervasive grids.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.