2020
DOI: 10.1007/978-3-030-49345-5_9
Intelligent Data Compression Policy for Hadoop Performance Optimization

Cited by 11 publications (2 citation statements)
References 6 publications
“…This may lead to a potential network bottleneck, since the reduce phase may require a large amount of shuffled data from the mappers, which may execute across different nodes in various racks. This can incur a severe performance penalty due to run-time data transfers [10]. In a heterogeneous scenario, some nodes will obviously complete more maps than others; the extent of the difference varies with the inherent heterogeneity of the Hadoop cluster.…”
Section: Introduction
confidence: 99%
“…The majority of research attempts at improving Hadoop's performance have focused on optimizing the map phase through effective data placement preceding map operations [9,11-14], primarily via some mechanism for grouping related data blocks and co-locating them within the same cluster node. However, for many applications the amount of intermediate data produced after the map phase is huge [10]. A trivial implementation of the copy/shuffle phase, followed by arbitrary reducer decisions (both the number of reducers and the choice of nodes to act as reducers), is often found inadequate.…”
Section: Introduction
confidence: 99%
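The intermediate-data problem the citing papers describe is commonly mitigated by compressing map output before the shuffle, trading extra CPU for reduced network transfer. As a minimal sketch (standard Hadoop configuration properties, not the specific compression policy proposed in the cited paper; the codec choice is illustrative):

```xml
<!-- mapred-site.xml: compress intermediate (map-output) data before the
     shuffle phase. mapreduce.map.output.compress and its codec property
     are standard Hadoop settings; Snappy is shown as one common choice. -->
<configuration>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
```

Whether such compression pays off depends on the workload: CPU-bound jobs may slow down, while shuffle-heavy jobs on congested networks typically benefit, which is precisely the trade-off an intelligent compression policy must weigh.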