2022
DOI: 10.48550/arxiv.2202.13293
Preprint
Past, Present and Future of Hadoop: A Survey

Cited by 3 publications (4 citation statements)
References 0 publications
“…With the help of this tool, we can construct and execute MapReduce tasks using any script as the mapper and reducer. The mapper reads the data from stdin via the Hadoop Streaming tool and provides the mapped key-value pairs to the reducer; the results of the reducing operation are written to stdout and then stored in HDFS [10]. The mapper phase is responsible for calculating the Euclidean distance between each training-set point and the target point, and the output of the mapper is a set of <Distance, Class> pairs that serve as the reducer's input. In the reducer, the minimum k distances are determined, and the class with the maximum frequency represents the predicted class.…”
Section: Hadoop Streaming
confidence: 99%
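The quoted kNN workflow can be sketched as plain Python functions. The query point, the value of k, the record format (`x,y,label`), and the function names are hypothetical stand-ins; in a real Hadoop Streaming job the mapper and reducer would be separate scripts reading stdin and writing stdout, with the query point shipped to each task (e.g. via an environment variable or the distributed cache):

```python
import math
from collections import Counter

# Hypothetical query point and k; a real Streaming job would distribute
# these to every mapper task rather than hard-coding them.
QUERY = (2.0, 3.0)
K = 3

def map_line(line):
    """Mapper step: one training record 'f1,f2,...,label' -> (distance, label)."""
    *feats, label = line.strip().split(",")
    point = tuple(float(f) for f in feats)
    dist = math.sqrt(sum((p - q) ** 2 for p, q in zip(point, QUERY)))
    return dist, label

def reduce_pairs(pairs, k=K):
    """Reducer step: keep the k nearest (distance, label) pairs, vote on the class."""
    nearest = sorted(pairs)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy training set standing in for records read from HDFS.
    train = ["1.0,2.0,A", "2.0,2.5,A", "8.0,9.0,B", "7.5,8.0,B", "2.5,3.5,A"]
    pairs = [map_line(rec) for rec in train]
    print(reduce_pairs(pairs))  # majority class among the k nearest neighbours
```

Sorting all pairs in the reducer mirrors the description above; at scale one would keep a bounded heap of size k instead.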
“…It has two main subprojects: the Hadoop Distributed File System (HDFS) and the MapReduce programming paradigm [23]. The other subprojects, such as YARN, Common, HBase, Hive, Ozone, and Zookeeper, provide complementary services [24]. Hadoop is suited to high-throughput, in-depth analysis where a larger portion or all of the data is harnessed [25].…”
Section: Hadoop Framework
confidence: 99%
“…All the divisions are processed simultaneously [26] and are parsed into (key, value) pair records. The map function processes these records and maps each of them to a set of intermediate (key, value) pairs [24]. Finally, reducers combine them to get a consolidated output.…”
Section: Hadoop Framework
confidence: 99%
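The split-map-reduce flow described above can be illustrated with a minimal word-count sketch. The splits, function names, and in-memory shuffle are illustrative assumptions, not the survey's code; a real Hadoop job would run the map tasks on separate nodes and group keys during the shuffle phase:

```python
from collections import defaultdict

def map_split(split):
    """Map phase: parse one input split into intermediate (key, value) pairs."""
    return [(word, 1) for word in split.split()]

def reduce_all(intermediate):
    """Reduce phase: combine all values sharing a key into one consolidated output."""
    grouped = defaultdict(int)
    for key, value in intermediate:
        grouped[key] += value
    return dict(grouped)

if __name__ == "__main__":
    # Two splits standing in for divisions processed simultaneously on different nodes.
    splits = ["hadoop stores data", "hadoop processes data"]
    intermediate = [pair for s in splits for pair in map_split(s)]
    print(reduce_all(intermediate))  # consolidated word counts across both splits
```

Each split is mapped independently, so the map calls could run in parallel; only the final grouping requires seeing all intermediate pairs.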
“…Compared to the conventional methodology, problem solving with metaheuristic approaches performed better as the dimensionality of the searched space grew. The authors in [13,14,42,[63][64][65] focused on the MapReduce framework, its limitations, the problems of job scheduling between nodes, and other algorithms proposed by different researchers. Some of these studies then categorized those algorithms according to a variety of performance-related quality indicators.…”
Section: Other Improved Algorithms
confidence: 99%