2017
DOI: 10.48550/arxiv.1705.03125
Preprint

Affinity Scheduling and the Applications on Data Center Scheduling with Data Locality

Abstract: The MapReduce framework is the de facto standard in Hadoop. Considering data locality in data centers, the load balancing problem for map tasks is a special case of the affinity scheduling problem. There is a large body of work on affinity scheduling that proposes heuristic algorithms, such as Delay Scheduling and Quincy, which try to increase data locality in data centers. However, not enough attention has been paid to theoretical guarantees on the throughput and delay optimality of such algorithms. In this work, we present and…

citations
Cited by 2 publications
(2 citation statements)
references
References 10 publications
“…Using the insights of this paper, for data parallel processing of big data, more sophisticated algorithms based on MapReduce can be used for speeding up the processing time, e.g. look at [36,39,5,40,19].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…After all map tasks are processed, one or multiple reduce tasks combine the results of all map tasks to produce the final result on the big data-set. Note that processing of big data-sets are either map-intensive or only consist of map task [14], [15], [16], [17], [18], so our focus in this paper is on map task scheduling. Each map task can be processed by any of the servers, but the service rate is faster in the servers that have the data chunk associated with the task, which are referred as local servers.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
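
The model described in the last citation statement (any server can process a map task, but servers holding the task's data chunk serve it faster) can be illustrated with a minimal sketch. This is not the algorithm of the cited preprint or of any citing paper; it is a simple greedy heuristic chosen only for illustration. The names `Server` and `route`, and the rates `LOCAL_RATE` and `REMOTE_RATE`, are hypothetical assumptions.

```python
from dataclasses import dataclass, field

# Assumed service rates (tasks per unit time); illustrative values only.
LOCAL_RATE = 1.0   # server holds the task's data chunk
REMOTE_RATE = 0.5  # server must fetch the data chunk from elsewhere


@dataclass
class Server:
    """Hypothetical server model: tracks queued local and remote tasks."""
    name: str
    chunks: set = field(default_factory=set)
    local_tasks: int = 0
    remote_tasks: int = 0

    def workload(self) -> float:
        # Expected time needed to drain the tasks already queued here.
        return self.local_tasks / LOCAL_RATE + self.remote_tasks / REMOTE_RATE


def route(task_chunk: str, servers: list) -> str:
    """Greedily send a map task to the server with the smallest
    estimated completion time, counting the new task itself."""
    def cost(server: Server) -> float:
        rate = LOCAL_RATE if task_chunk in server.chunks else REMOTE_RATE
        return server.workload() + 1.0 / rate

    best = min(servers, key=cost)  # ties go to the earlier server in the list
    if task_chunk in best.chunks:
        best.local_tasks += 1
    else:
        best.remote_tasks += 1
    return best.name
```

Under these assumed rates, repeated tasks for the same chunk stay on the local server until its backlog outweighs the penalty of remote processing, at which point a task spills over to a remote server.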