Rising power consumption increases the cost of cluster operation. One approach to reducing power consumption is to build a cluster from small nodes equipped with low-power, high-performance processors. Since many low-power nodes do not have storage devices, a separate storage system is required to hold large-volume data, and the nodes mount this storage space to save data. When a Hadoop cluster is configured in this way, each node's access to the storage system results in excessive network load and delays the execution of Hadoop map tasks. In this study, we propose a new map task scheduling policy for Hadoop. This policy transmits multiple splits to a node at once to reduce network load. In addition, the local storage space of each node is used as a cache for splits, which shortens the time needed to access them, so this policy can reduce the execution time of Hadoop applications.

Keywords: Hadoop; ARM; cluster; map; scheduling.

Biographical notes: Bongen Gu received his PhD in Computer Engineering at Kyungpook University. He is currently a Professor at the Department of Computer Engineering, Korea National University of Transportation. He leads the Computer System Laboratory (CSL). His research interests include high-performance computing, parallel systems, storage, embedded systems, mobile systems and agricultural product monitoring systems.

Yoonsik Kwak received his PhD in Electronics Engineering at Kyung-Hee University. He is currently a Professor at the Department of Computer Engineering, Korea National University of Transportation. He leads the Microprocessor and Embedded System Laboratory (MESL). His research interests include embedded systems, sensor network applications and agricultural product monitoring systems.

Copyright © 2016 Inderscience Enterprises Ltd.
This paper is a revised and expanded version of a paper entitled 'TERMbased MAP task scheduling in an ARM-based Hadoop cluster' presented at Int.