R3s

Yan, Xinan; Wong, Bernard; Choy, Sharon

doi:10.1145/3008167.3008171

Proceedings of the 15th International Workshop on Adaptive and Reflective Middleware 2016

Cited by 1 publication

References 5 publications

Optimizing data locality by executor allocation in spark computing environment

Tang et al. 2023, ComSIS

Data locality is an important concept in big data processing. Most existing research optimizes data locality through task scheduling. However, because executors are the containers in which tasks run, the nodes on which executors are started directly affect the locality level that tasks can achieve. This paper tries to improve data locality through executor allocation for the reduce stage in the Spark computing environment. First, we calculate the network distance matrix of executors and formulate an optimal executor allocation problem that minimizes the total communication distance. Then, when the network distance between executors satisfies the triangle inequality, an approximate algorithm is proposed; when it does not, a greedy algorithm is proposed. Finally, we evaluate the performance of our algorithms in a practical Spark cluster using several representative micro-benchmarks (Sort and Join) and macro-benchmarks (PageRank and LDA). Experimental results show that the proposed algorithms reduce task execution time by lowering data communication.
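The abstract does not spell out its greedy algorithm, so the following is only an illustrative sketch of the general idea: given a network distance matrix between candidate executors, pick k of them so that the total pairwise distance stays small. The function names `total_distance` and `greedy_allocate`, the seeding with the closest pair, and the selection rule are all assumptions made for illustration, not the method of Tang et al.

```python
from itertools import combinations


def total_distance(dist, chosen):
    """Sum of pairwise network distances among the chosen executors."""
    return sum(dist[a][b] for a, b in combinations(chosen, 2))


def greedy_allocate(dist, k):
    """Pick k executor indices with a simple greedy heuristic.

    Hypothetical rule (an assumption, not the paper's algorithm):
    seed with the closest pair, then repeatedly add the candidate
    whose summed distance to the already chosen executors is smallest.
    `dist` is a symmetric n x n matrix given as a list of lists.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of candidates")
    if k == 1:
        return [0]  # any single executor has zero total distance

    # Seed: the pair of candidates with the smallest mutual distance.
    first = min(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    chosen = list(first)

    # Grow the set one executor at a time.
    while len(chosen) < k:
        remaining = [c for c in range(n) if c not in chosen]
        best = min(remaining, key=lambda c: sum(dist[c][s] for s in chosen))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Toy distance matrix for four candidate executors (made-up values).
    example = [
        [0, 1, 5, 6],
        [1, 0, 2, 7],
        [5, 2, 0, 3],
        [6, 7, 3, 0],
    ]
    picked = greedy_allocate(example, 3)
    print(picked, total_distance(example, picked))
```

A greedy rule like this gives no optimality guarantee in general; it only shows how a distance matrix can drive the allocation decision.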
