2010
DOI: 10.1007/978-3-642-15646-5_24

Load Balancing for Regular Meshes on SMPs with MPI

Abstract: Domain decomposition for regular meshes on parallel computers has traditionally been performed by attempting to exactly partition the work among the available processors (now cores). However, these strategies often do not consider the inherent system noise which can hinder MPI application scalability to emerging peta-scale machines with 10000+ nodes. In this work, we suggest a solution that uses a tunable hybrid static/dynamic scheduling strategy that can be incorporated into current MPI implementations of mes…
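
The abstract is truncated, so the details of the authors' scheduler are not available here. As a rough illustration only of what a tunable hybrid static/dynamic strategy can look like, the sketch below assigns a fixed fraction of the work to each worker up front and lets workers pull the remainder from a shared queue. Python threads stand in for the cores of one SMP node, and every name (`hybrid_schedule`, `static_fraction`, `work`) is a hypothetical choice for this example, not taken from the paper.

```python
import threading
from queue import Queue, Empty


def hybrid_schedule(num_tasks, num_workers, static_fraction, work):
    """Run work(i) for every i in range(num_tasks).

    The first int(num_tasks * static_fraction) tasks are split into
    contiguous blocks, one per worker (static part). The remaining tasks
    sit in a shared queue and go to whichever worker is free (dynamic part).
    Returns a list mapping each task index to the worker that ran it.
    """
    num_static = int(num_tasks * static_fraction)
    owner = [None] * num_tasks

    # Shared pool for the dynamically scheduled remainder.
    dynamic = Queue()
    for i in range(num_static, num_tasks):
        dynamic.put(i)

    def worker(wid):
        # Static phase: a fixed contiguous block, decided before execution.
        lo = wid * num_static // num_workers
        hi = (wid + 1) * num_static // num_workers
        for i in range(lo, hi):
            work(i)
            owner[i] = wid
        # Dynamic phase: a worker delayed by noise simply takes fewer tasks.
        while True:
            try:
                i = dynamic.get_nowait()
            except Empty:
                break
            work(i)
            owner[i] = wid

    threads = [threading.Thread(target=worker, args=(w,))
               for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return owner
```

Setting `static_fraction=1.0` reduces this to a purely static block decomposition, and `0.0` to fully dynamic scheduling; values in between are the tuning knob. A real MPI code would apply this within each node's share of the mesh rather than across the whole problem.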

Cited by 12 publications (19 citation statements)
References 5 publications
“…The above characterization confirms the benefits of the solution proposed in [6]. When a platform has several different noise events of different lengths, a dynamic scheduling strategy with an assortment of task granularities can be used.…”
Section: Architectures Considered (supporting)
confidence: 73%
“…Our initial work shows that the performance improvement of an application is relatively unnoticeable when running on a small number of nodes of a cluster, but becomes much more dramatic as we scale the application to a large number of nodes [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…V. Kale et al suggested a hybrid static/dynamic approach in [16] that can be incorporated into current MPI implementations of structured grid codes to improve the load balancing of the initial static decompositions. This work embraces the fundamental principles advocated in [16], and applies it in the context of dense matrix factorizations. Xue et al introduced an approach in [24] that improves the data locality when executing loop iterations in codes.…”
Section: Related Work (mentioning)
confidence: 99%
“…In order to be scalable for future high-performance clusters(i.e. exascale), the code running within a node of a cluster must be tuned such that it achieves not simply "high-performance", but also "performance consistency" [14], [16]. Such static tuning techniques provide few guarantees on performance consistency.…”
Section: Introduction (mentioning)
confidence: 99%