2017
DOI: 10.1109/jsyst.2015.2496368

Bandwidth-Aware Scheduling With SDN in Hadoop: A New Trend for Big Data

Abstract: … However, all of them either ignore allocating tasks in a global view or disregard available bandwidth as the basis for scheduling. In this paper we propose a heuristic bandwidth-aware task scheduler BASS to combine Hadoop with SDN. It is not only able to guarantee data locality in a global view but also can efficiently assign tasks in an optimized way. Both examples and experiments demonstrate that BASS has the best performance in terms of job completion time. To our knowledge, BASS is the first to exploit t…
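
The general idea of weighing data locality against available bandwidth can be illustrated with a minimal sketch. This is not the BASS algorithm from the paper, whose details are not given here; it is a generic greedy heuristic under stated assumptions. The names `Node`, `estimated_finish` and `pick_node`, and the cost model (node wait time + transfer time + compute time), are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass
class Node:
    """A worker node (hypothetical model, not taken from the paper)."""
    name: str
    idle_at: float         # time in seconds at which the node becomes free
    bandwidth_mbps: float  # assumed available bandwidth for fetching remote data


def estimated_finish(node: Node, holds_data: bool,
                     data_mb: float, compute_s: float) -> float:
    """Estimated completion time of one task on `node`.

    A node that already holds the input split pays no transfer cost;
    a remote node pays data size divided by its available bandwidth.
    """
    transfer_s = 0.0 if holds_data else (data_mb * 8.0) / node.bandwidth_mbps
    return node.idle_at + transfer_s + compute_s


def pick_node(nodes: Iterable[Node], local_names: Set[str],
              data_mb: float, compute_s: float) -> Node:
    """Greedy choice: the node with the earliest estimated completion time."""
    return min(
        nodes,
        key=lambda n: estimated_finish(n, n.name in local_names, data_mb, compute_s),
    )


# Usage: node A holds the data but is busy for 30 s; node B is idle but remote.
busy_local = Node("A", idle_at=30.0, bandwidth_mbps=1000.0)
idle_fast_remote = Node("B", idle_at=0.0, bandwidth_mbps=800.0)
idle_slow_remote = Node("B", idle_at=0.0, bandwidth_mbps=8.0)

fast_choice = pick_node([busy_local, idle_fast_remote], {"A"}, data_mb=128.0, compute_s=10.0)
slow_choice = pick_node([busy_local, idle_slow_remote], {"A"}, data_mb=128.0, compute_s=10.0)
print(fast_choice.name, slow_choice.name)
```

With ample bandwidth the idle remote node wins despite the transfer cost; with scarce bandwidth the scheduler waits for the data-local node instead. A scheduler that ignores bandwidth cannot make this distinction.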

Cited by 72 publications (37 citation statements)
References 13 publications
“…So far, in the Big Data scenario, most works have leveraged SDN to optimize the communication-intensive phase of Hadoop MapReduce [15] by placing MapReduce tasks close to their data, thus reducing the amount of data that must be transferred and therefore the MapReduce job completion time [29,38,43,44]. A first work that explores the tight integration of application and network control utilizing SDN has been presented by Wang et al [43], which explores the idea of application-aware networking through the design of an SDN controller using a cross-layer approach that configures the network based on MapReduce job dynamics at runtime.…”
Section: Related Work on Big Data and SDN
Citation type: mentioning
confidence: 99%
“…Their experimental results show a 14-38% improvement in query execution time over a traditional approach that optimizes task and flow scheduling without SDN collaboration. Qin et al in [38] propose a heuristic bandwidth-aware task scheduler that combines Hadoop with the bandwidth control capability offered by SDN with the goal to minimize the completion time of MapReduce jobs.…”
Section: Related Work on Big Data and SDN
Citation type: mentioning
confidence: 99%
“…In this study, logs are collected through log agents in the software from the network security module, authentication module, and network control module, and as the software can collect information related to the devices and network, the environment can be more simplified than that comprised of complicated devices in the existing network [17][18][19]. Figure 8 shows an example of the big-data security analysis architecture of the proposed security framework.…”
Section: (I) Traffic Collector
Citation type: mentioning
confidence: 99%
“…There have been several studies [1], [2], pointing the benefits of Software Defined Networking (SDN) for handling network impact on big data workloads. Typically they address the network impact from communication patterns of Hadoop®, which is the most popular big data framework owing to its availability and reliability advantage.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Behavior and performance of Hadoop® clusters in datacenters is effected by size of nodes, data size and workload types as well as networking characteristics. Network speed and latency play an important role in Hadoop® job completion times but more importantly they are impacted by the availability and resiliency features, traffic bursting nature and subscription ratio. There have been several studies [1], [2], pointing the benefits of Software Defined Networking (SDN) for handling network impact on big data workloads. Typically they address the network impact from communication patterns of Hadoop®, which is the most popular big data framework owing to its availability and reliability advantage.…”
Citation type: mentioning
confidence: 99%