2015
DOI: 10.5121/ijcsit.2015.7410
|View full text |Cite
|
Sign up to set email alerts
|

Map-Reduce Implementations: Survey and Performance Comparison

Abstract: Map Reduce has gained remarkable significance as a prominent parallel data processing tool in the research community, academia and industry with the spurt in volume of data that is to be analyzed. Map Reduce is used in different applications such as data mining, data analytics where massive data analysis is required, but still it is constantly being explored on different parameters such as performance and efficiency. This survey intends to explore large scale data processing using MapReduce and its various imp… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
5
0

Year Published

2017
2017
2020
2020

Publication Types

Select...
5
2
2

Relationship

0
9

Authors

Journals

citations
Cited by 10 publications
(5 citation statements)
references
References 10 publications
0
5
0
Order By: Relevance
“…Data is nowadays stored and processed in distributed systems, often in a geographically dispersed manner. This introduces a layer of complexity that MapReduce frameworks, such as Apache Spark, excel at handling [61]. Container engines, and in particular Docker, are also becoming an essential part of bioinformatics pipelines as they improve delivery, interoperability and reproducibility of scientific analyses.…”
Section: Discussionmentioning
confidence: 99%
“…Data is nowadays stored and processed in distributed systems, often in a geographically dispersed manner. This introduces a layer of complexity that MapReduce frameworks, such as Apache Spark, excel at handling [61]. Container engines, and in particular Docker, are also becoming an essential part of bioinformatics pipelines as they improve delivery, interoperability and reproducibility of scientific analyses.…”
Section: Discussionmentioning
confidence: 99%
“…The first one is a data storage called Hadoop DFS (HDFS), whereas the other is an information process called (Hadoop MapReduce Framework). Hadoop DFS is a block-structured filing system controlled by only one master node as Google's GFS [18].…”
Section: A Volume: This Criterion Represents the Most Immedi-mentioning
confidence: 99%
“…The authors Zeba Khanam and Shafali Agarwal, in their paper [3], explore large scale data processing using MapReduce and its various implementations to facilitate the database, researchers and other communities in developing the technical understanding of the MapReduce framework. They continue with exploring different MapReduce implementations; most popular Hadoop implementations and other similar implementations using other platforms and compare those based on different parameters.…”
Section: Related Workmentioning
confidence: 99%