2017
DOI: 10.1155/2017/1425102

Handling Data Skew in MapReduce Cluster by Using Partition Tuning

Abstract: The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning st…

Citations: cited by 23 publications (7 citation statements)
References: 19 publications
“…Execute the Map Reduce jobs on dissimilar datasets, and work out the mean total percentage error (MAPE) of all partitions in every state. The MAPE is defined as follows Where N is the amount of reduce tasks in a job, and are the predicted and calculated value of partition size of reduce task i, respectively [18]. …”
Section: Results (mentioning)
confidence: 99%
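The equation itself did not survive in the quoted statement above. As a rough illustration only, the sketch below computes the standard mean absolute percentage error over per-reducer partition sizes. The formula used, MAPE = (100 / N) * sum(|predicted_i - actual_i| / actual_i), is the textbook definition and is an assumption here, not necessarily the exact form used by the citing paper. The function name `mape` and the sample sizes are hypothetical.

```python
from typing import Sequence


def mape(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumption: the textbook definition, averaged over the N reduce
    tasks of one job. `predicted[i]` and `actual[i]` are the predicted
    and measured partition sizes of reduce task i.
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, a in zip(predicted, actual):
        if a == 0:
            # The percentage error is undefined for an empty partition.
            raise ValueError("actual partition size must be non-zero")
        total += abs(p - a) / abs(a)
    return 100.0 * total / len(actual)


# Hypothetical partition sizes (e.g. in MB) for a job with two reduce tasks.
example = mape([90.0, 110.0], [100.0, 100.0])
```

A lower value means the predicted partition sizes track the measured ones more closely. Note that this definition weights every reduce task equally regardless of its partition size.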
“…In distributed execution, where each cluster of model elements is in a partition and localization, the model transformation processing can be minimized, with less network traffic overhead for sending data between executors (Worker Nodes). In both strategies there are open issues, such as data balancing (Le et al, 2014), data skew processing (Gao et al, 2017), and data locality (Jin et al, 2011) that need be contemplated in our approach.…”
Section: Executing Model Transformations Using Graphframe (mentioning)
confidence: 99%
“…In the past few years, the prevalence of big data has paved the way for applications of deep learning techniques [37], [38]. With the development of computational intelligence [39], deep learning has been successful in healthcare engineering and neuroscience, providing intelligent solutions with data volumes for significant neural image data processing and analytics. To overcome the limitation of traditional MVPA approaches and improve the performance of cross-subject decoding, Koyamada et al [13] introduced a feedforward deep neural network to classify different brain features representing of various tasks from fMRI data.…”
Section: B Deep Transfer Learning (mentioning)
confidence: 99%