MapReduce is an established distributed and parallel programming model, widely adopted for large-scale computing. Intelligent scheduling decisions can reduce the overall runtime of jobs. MapReduce performance is currently limited by its default scheduler, which does not adapt well to heterogeneous environments. The Longest Approximate Time to End (LATE) scheduler was designed with heterogeneous environments in mind, but it too has several shortcomings because it computes task progress in a static manner. Recent research has begun to address this lack of an adequate approach for heterogeneous environments. In this paper, we propose a novel MapReduce scheduler for heterogeneous environments based on reinforcement learning, called the MapReduce Reinforcement Learning (MRRL) scheduler. It observes the system state during task execution and suggests speculative re-execution of slower tasks on other available nodes in the cluster for faster execution. The proposed approach adapts to the heterogeneous environment and requires no prior knowledge of the environment's characteristics. It is expected that, over a few runs, the system will learn to better map computing requirements to the resources available in a heterogeneous cluster and thereby minimize the overall job completion time.

© 2015 The Authors. Published by Elsevier B.V. Peer-review under responsibility of scientific committee of 2nd International Symposium on Big Data and Cloud Computing (ISBCC'15).
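To make the idea of learning when to speculate more concrete, the following is a minimal sketch using tabular Q-learning. It is an illustration only, not the scheduler proposed in the paper: the abstract does not specify the state, action, or reward design, so the node names, speeds, task states, speculation overhead, and the `SpeculationAgent` class are all hypothetical assumptions. The agent chooses between leaving a task where it is and launching a speculative copy on another node, and is rewarded with the negative of the resulting completion time.

```python
from collections import defaultdict
import random

# All values below are hypothetical assumptions for illustration only.
NODE_SPEED = {"node_fast": 4.0, "node_slow": 0.8}  # work units per second
TOTAL_WORK = 10.0          # work units per task
SPECULATION_COST = 0.5     # assumed overhead of occupying an extra slot
# Task states: (work left, speed of the node currently running the task).
STATES = {"straggler": (5.0, 0.5), "on_track": (2.0, 2.0)}
ACTIONS = ["keep"] + list(NODE_SPEED)


def cost(state, action):
    """Completion time (plus overhead) if `action` is taken in `state`."""
    work_left, current_speed = STATES[state]
    original = work_left / current_speed
    if action == "keep":
        return original
    # The speculative copy restarts from scratch; whichever copy finishes
    # first completes the task.
    speculative = TOTAL_WORK / NODE_SPEED[action]
    return min(original, speculative) + SPECULATION_COST


class SpeculationAgent:
    """Tabular Q-learning agent (hypothetical, not the paper's design)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)      # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def best(self, state):
        """Greedy action under the current value estimates."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(state)

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max Q(next)."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(episodes=200, seed=0):
    """Run one-step episodes; 'done' is a terminal state with zero value."""
    agent = SpeculationAgent(ACTIONS, seed=seed)
    for _ in range(episodes):
        for state in STATES:
            action = agent.choose(state)
            agent.update(state, action, -cost(state, action), "done")
    return agent


if __name__ == "__main__":
    agent = train()
    for state in STATES:
        print(state, "->", agent.best(state))
```

Under these assumed numbers, the learned policy should settle on re-running the straggler on the faster node while leaving the on-track task alone. A real scheduler would need a much richer state description (for example, observed progress rates per node) and would learn from measured job runtimes rather than a fixed cost table.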