2015 IEEE 31st International Conference on Data Engineering
DOI: 10.1109/icde.2015.7113338

Automatic tuning of bag-of-tasks applications

Cited by 8 publications (7 citation statements)
References 34 publications
“…A natural choice to this end is to employ sampling, e.g., as in [13], [14]. However, sampling-based automated profile generation seems to be a particularly challenging task in Spark.…”
Section: Discussion On The Provision Of End-to-end Solutions
confidence: 99%
“…However, all these cost modeling and profiling techniques do not cover specific phenomena in Spark execution, such as super-linear speed-ups for small degrees of parallelism and performance degradation for large ones. The proposals in [13], [14] present a sampling-based approach to estimate the profile of a single embarrassingly parallel task, based on the behavior of some of its partitions. However, they assume that partitions are scheduled in multiple waves, whereas we have adopted a configuration, where all partitions are scheduled in a single wave but there are multiple interdependent tasks.…”
Section: Related Work
confidence: 99%
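
To make the wave-based estimate described above concrete, here is a minimal Python sketch of the sampling idea attributed to [13], [14]. The names `run_partition`, `partitions`, and `num_workers` are hypothetical stand-ins introduced for illustration, not the actual interface of those proposals: a few randomly sampled partitions are timed, and the makespan is extrapolated assuming the remaining partitions are scheduled in waves.

```python
import math
import random
import time

def estimate_runtime(run_partition, partitions, num_workers, sample_size=5):
    """Estimate the makespan of an embarrassingly parallel task by timing
    a random sample of its partitions (hypothetical interface)."""
    sample = random.sample(partitions, min(sample_size, len(partitions)))
    timings = []
    for p in sample:
        start = time.perf_counter()
        run_partition(p)                      # execute one sampled partition
        timings.append(time.perf_counter() - start)
    mean_time = sum(timings) / len(timings)
    # Partitions run in waves of num_workers at a time, so the estimated
    # makespan is (number of waves) x (mean per-partition time).
    waves = math.ceil(len(partitions) / num_workers)
    return waves * mean_time
```

Note that this wave model is exactly the assumption the citing authors push back on: with a single wave of interdependent tasks, per-partition sampling alone no longer determines the makespan.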
“…Since the motifs search space is a combinatorial tree, it is logically partitioned into many sub-trees. In analytics workload, the number of sub-trees affects the utilization of the computing resources [20]. On a supercomputer, the StarQL optimizer estimates the query workload using a sampling technique and determines that 2,048 cores can be fully utilized.…”
Section: Parallel Support For StarQL Operations
confidence: 99%
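
As a rough illustration of how a sampled workload estimate can bound the number of usable cores, consider the sketch below. It is not StarQL's optimizer; `subtrees`, `explore`, and the `min_busy_seconds` knob are assumptions introduced for the example.

```python
import random
import time

def cores_fully_utilized(subtrees, explore, max_cores,
                         sample_size=32, min_busy_seconds=1.0):
    """Estimate total work from a random sample of sub-trees, then return
    the largest core count that keeps every core meaningfully busy."""
    sample = random.sample(subtrees, min(sample_size, len(subtrees)))
    timings = []
    for tree in sample:
        start = time.perf_counter()
        explore(tree)                         # search one sampled sub-tree
        timings.append(time.perf_counter() - start)
    total_work = (sum(timings) / len(timings)) * len(subtrees)
    # Utilization is capped both by the decomposition (at most one busy
    # core per sub-tree) and by giving each core enough work to amortize
    # scheduling overhead (min_busy_seconds is an assumed threshold).
    return min(max_cores, len(subtrees),
               max(1, int(total_work / min_busy_seconds)))
```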
“…In order to utilize large infrastructures, it is critical to find the best decomposition and to accurately estimate runtimes. StarDB adopts our automatic tuning framework [8] to decide the problem decomposition and estimate serial and parallel runtimes. Random sample tasks are used to model the workload of different decompositions.…”
Section: Indexing and Large-scale Parallelism
confidence: 99%
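
The decomposition choice described in this statement can be sketched in a few lines of Python. This only approximates the idea behind the cited tuning framework [8]; the `decompositions` mapping (name to task list) and `run_task` are hypothetical names introduced here.

```python
import random
import time

def pick_decomposition(decompositions, run_task, num_cores, sample_size=8):
    """For each candidate decomposition, time a few randomly sampled tasks,
    extrapolate serial and parallel runtimes, and keep the fastest."""
    best_name, best_parallel = None, float("inf")
    for name, tasks in decompositions.items():
        sample = random.sample(tasks, min(sample_size, len(tasks)))
        timings = []
        for task in sample:
            start = time.perf_counter()
            run_task(task)                    # execute one sampled task
            timings.append(time.perf_counter() - start)
        mean_time = sum(timings) / len(timings)
        serial = mean_time * len(tasks)       # one core runs everything
        parallel = serial / min(num_cores, len(tasks))
        if parallel < best_parallel:
            best_name, best_parallel = name, parallel
    return best_name, best_parallel
```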
“…StarDB uses our novel data structures [9] and parallel string algorithms [10] to natively facilitate large-scale analytics for strings. We incorporate our automatic tuning framework for large infrastructures [8] to meet users' time and budget constraints. StarDB allows users to easily form complex string queries.…”
Section: Introduction
confidence: 99%