Proceedings of the 1995 ACM/IEEE Conference on Supercomputing (CDROM) - Supercomputing '95 1995
DOI: 10.1145/224170.224306

Balancing processor loads and exploiting data locality in N-body simulations

Abstract: Although N-body simulation algorithms are amenable to parallelization, performance gains from execution on parallel machines are difficult to obtain due to load imbalances caused by irregular distributions of bodies. In general, there is a tension between balancing processor loads and maintaining locality, as the dynamic re-assignment of work necessitates access to remote data. Fractiling is a dynamic scheduling scheme that simultaneously balances processor loads and maintains locality by exploiting the self-…

Cited by 51 publications (47 citation statements)
References 20 publications
“…Exemplar runtime systems implementing this approach are Zoltan [3], Chombo [15], and Charm++ [10]. Similar schemes have also been proposed and used in MPI applications [16,17]. This paper proposes concepts which build upon these existing frameworks in order to make decisions related to load balancing to get good performance.…”
Section: Previous Work (mentioning)
confidence: 99%
“…This layout is known in parallel computing as the Morton ordering and has been used for load balancing purposes [5,28,29,45,51,61]. It has also been applied for bandwidth reduction in information theory [6], for graphics applications [24,39], and for database applications [30].…”
Section: The Morton Layout Lmo (mentioning)
confidence: 99%
“…Such restructuring techniques have been studied for pointer-based data structures, such as heaps [35,37,38] and trees [13]; for profile-driven object placement [8]; for matrices with special structure (e.g., banded matrices in LAPACK [1], or sparse matrices [20]); and in parallel computing [5,28,29,45,51,61]. But when working with general dense matrices in a uniprocessor environment, most programmers are reluctant to alter the default row-major or column-major linearization of multidimensional arrays that high-level languages provide, even when such ordering degrades cache performance.…”
Section: Introduction (mentioning)
confidence: 99%
“…This layout is known in parallel computing as the Morton ordering and has been used for load balancing purposes [7,25,26,33,36,40]. It has also been applied for bandwidth reduction in information theory [9], for graphics applications [20,30], and for database applications [27].…”
Section: Algorithm 6: Non-linear Array Layout (mentioning)
confidence: 99%
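
Several of the citation statements above refer to the Morton ordering as a layout used for load balancing. As a rough illustration only, the sketch below computes a 2D Morton (Z-order) index by interleaving the bits of a row and a column coordinate. The function names `morton_index` and `morton_order`, the `bits` parameter, and the choice to place column bits in the even positions are assumptions made for this example; they are not taken from the cited papers.

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into one index.

    Assumed convention: column bits go to even positions, row bits to
    odd positions. Other sources may use the opposite convention.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)      # column bit -> even position
        index |= ((row >> b) & 1) << (2 * b + 1)  # row bit -> odd position
    return index


def morton_order(n: int) -> list[tuple[int, int]]:
    """Return the (row, col) cells of an n x n grid sorted by Morton index."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: morton_index(rc[0], rc[1]))


if __name__ == "__main__":
    # Cells that are close in the grid tend to be close in the ordering,
    # so contiguous index ranges correspond to compact blocks of the grid.
    print(morton_order(4))
```

Under this convention, splitting the sorted list into equal contiguous chunks gives each chunk a spatially compact group of cells, which is one reason an ordering of this kind is attractive when both balanced work and locality matter.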