SC14: International Conference for High Performance Computing, Networking, Storage and Analysis 2014
DOI: 10.1109/sc.2014.75

Optimizing Data Locality for Fork/Join Programs Using Constrained Work Stealing

Abstract: We present an approach to improving data locality across different phases of fork/join programs scheduled using work stealing. The approach consists of: (1) user-specified and automated approaches to constructing a steal tree, the schedule of steal operations, and (2) constrained work-stealing algorithms that constrain the actions of the scheduler to mirror a given steal tree. These are combined to construct work-stealing schedules that maximize data locality across computation phases while ensuring l…

Cited by 31 publications (25 citation statements)
References 33 publications
“…In [15], He et al proposed a bilinear quarter approximation strategy for fractional motion estimation design together with a data reuse strategy for ultrahigh definition video applications. Lifflander et al in [16] presented a work-stealing algorithm for fork-/join-based parallel programming models to gain performance boost-up though improving the data locality property.…”
Section: Related Work (mentioning)
confidence: 99%
“…On the other hand, the relative impact of the task scheduler overhead is larger on fine-grained tasks, thus processor cores may spend more time in scheduling tasks than executing application code, which slows down the application progress. Finally, sophisticated schedulers, such as locality-aware schedulers [5], [6], have the potentiality of increasing application performance by speeding up the execution of single tasks. However, these schedulers introduce larger runtime overhead than best-effort schedulers currently used in Cilk [7] or Intel Threading Building Blocks (TBB) [8].…”
Section: Motivation (mentioning)
confidence: 99%
“…Given such difficulties, it comes to no surprise that previous studies have looked at either application algorithms running on a relative small set of available programming model runtimes, or one/two specific applications implemented in different ways to generate different task granularities on a single runtime system [6], [9], [10]. Either approach limits the spectrum of possible configuration that can be analyzed and cannot be used to study the impact of scheduling overhead of task schedulers that have not yet been implemented.…”
Section: Motivation (mentioning)
confidence: 99%
“…Our work also seeks to reduce the amount of control data required, but it solves a more general problem not specific to a certain programming paradigm. Specific schemes for reducing control-related data have been studied extensively for a wide variety of algorithms [18]- [22].…”
Section: Introduction (mentioning)
confidence: 99%