Large-scale data processing has grown rapidly in recent years. The MapReduce programming model, which originated in functional languages, has been adopted in distributed systems and has performed very well in large-scale data processing since 2006. Hadoop, the most popular open-source MapReduce runtime framework, provides a reliable, scalable, distributed system that uses this programming model to process large-scale data across clusters of computers. In Hadoop, files are split into many blocks, and every block is replicated on several computers in the cluster. To process these blocks efficiently, each job is divided into many tasks that run in parallel, each handling one file block. To make full use of the network bandwidth in such systems, data locality has received increasing attention. Exploiting the existence of replicated data blocks, we propose a data-replica scheduler that covers both task scheduling and data allocation. The scheduler takes full advantage of the replicas stored on the local DataNode, which reduces the cost of data transfer and improves system performance. Experimental results show that our scheduler not only improves CPU utilization but also reduces the number of packets transferred over the network.
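
The idea of preferring nodes that already hold a replica of a task's input block can be illustrated with a minimal sketch. It is not the scheduler proposed in this paper and does not use any Hadoop API: the function `schedule_tasks`, its arguments, and the greedy two-pass policy are all assumptions made for illustration.

```python
def schedule_tasks(block_replicas, free_slots):
    """Greedy, locality-first task assignment (illustrative only).

    block_replicas: dict mapping a block id to the list of nodes
                    that hold a replica of that block (hypothetical input).
    free_slots:     dict mapping a node name to its number of free task slots.
    Returns a dict mapping each scheduled block id to (node, is_local).
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is unchanged
    assignment = {}
    deferred = []

    # Handle blocks with the fewest replicas first, since they have
    # the fewest chances of being placed on a local node.
    ordered = sorted(block_replicas, key=lambda b: len(block_replicas[b]))

    # Pass 1: place each task on a node that stores a replica of its block.
    for block in ordered:
        local = [n for n in block_replicas[block] if slots.get(n, 0) > 0]
        if local:
            node = max(local, key=lambda n: slots[n])
            slots[node] -= 1
            assignment[block] = (node, True)
        else:
            deferred.append(block)

    # Pass 2: remaining tasks go to any node with a free slot;
    # these tasks must read their block over the network.
    for block in deferred:
        candidates = [n for n in slots if slots[n] > 0]
        if not candidates:
            break  # cluster is full; leftover tasks wait for a later round
        node = max(candidates, key=lambda n: slots[n])
        slots[node] -= 1
        assignment[block] = (node, False)

    return assignment
```

Each task assigned with `is_local` set to `True` reads its block from local disk, so in this simplified model the number of `False` entries is a rough proxy for the network traffic that a locality-aware scheduler tries to reduce.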