2020
DOI: 10.21203/rs.3.rs-43526/v1
Preprint

A Comprehensive Performance Analysis of Apache Hadoop and Apache Spark for Large Scale Data Sets Using HiBench

Abstract: In recent times, Big Data analytics has received tremendous attention; it involves storing, processing, and analysing large-scale datasets. The advent of distributed computing frameworks such as Hadoop and Spark offers an efficient solution for analysing vast amounts of data. Owing to the availability of an application programming interface (API) and its performance, Spark has become very popular, even more popular than the MapReduce framework. Both these frameworks have more than 150 parameters and the combination of th…

Citations: cited by 2 publications (4 citation statements)
References: 18 publications
“…It was observed that 58% of the reviewed papers [15, were based on BD predictive analytics, 18% BD prescriptive analytics [103][104][105] , 11% BD descriptive analytics [15, , whiles 9% (A+B) [100][101][102][103][104][105] and 5% (B+C) [106][107][108] . Few studies in BDA used prescriptive analytics (see Table A1 in Appendix); this can be attributed to fact that big data prescriptive analytics is in its early stage.…”
Section: Methods (mentioning)
confidence: 99%
“…Hence, it is not a shock to see such a massive study in the healthcare industry (30%), followed by anomaly detection (11%), cybersecurity, data privacy & IoT (5%) and automobile and transportation (5%); (see Table A1 in Appendix). Concerning the data size, some studies indicated their data size in terms of the storage space ranging from 708 MB [72] to 600 GB [103] , while in terms of the number of observations, it ranges from 1789 [93] to 3 billion records [112] . Based on the data size (e.g., 708 MB [72] ), one can say that the study by Jallad et al [72] is not related to big data.…”
Section: Methods (mentioning)
confidence: 99%