2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) 2019
DOI: 10.1109/empdp.2019.8671619

Dynamic Loop Scheduling Using MPI Passive-Target Remote Memory Access

Abstract: Scientific applications often contain large computationally-intensive parallel loops. Loop scheduling techniques aim to achieve load balanced executions of such applications. For distributed-memory systems, existing dynamic loop scheduling (DLS) libraries are typically MPI-based, and employ a master-worker execution model to assign variably-sized chunks of loop iterations. The master-worker execution model may adversely impact performance due to the master-level contention. This work proposes a distributed chu…

Cited by 4 publications (13 citation statements)
References 27 publications
“…Implementation Approaches: Hierarchical DLS techniques can be implemented either using the hierarchical master-worker [13] or using the distributed chunk-calculation model [15]. The present work evaluates the use of two different implementations, MPI+OpenMP and MPI+MPI, to complement the distributed chunk-calculation approach (see Section 2).…”
Section: Methods (mentioning)
confidence: 99%
“…Recently, a distributed chunk-calculation approach was proposed for developing DLS techniques executing on distributed-memory systems [15]. This approach eliminated the use of the master-worker model by exploiting the one-sided communication features offered in the MPI-3 standard.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Although centralizing the chunk assignment does not mean centralizing the chunk calculation, many of the recent DLS techniques employ a master-worker execution model that centralizes both the chunk calculation and the chunk assignment at the master side [6,7,8,9,10]. The current work extends our earlier distributed chunk calculation approach (DCA) [11] and makes the following unique contributions.…”
Section: Introduction (mentioning)
confidence: 94%
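
The citation statements above contrast a master-worker model, where the master both calculates and assigns chunks, with a distributed chunk-calculation approach in which each worker calculates its own chunk. The following single-process Python sketch illustrates that idea only; it is not the paper's implementation and does not use MPI. A `threading.Lock` stands in for an exclusive passive-target lock on a remotely accessible window, and the chunk-size rule (guided self-scheduling, chunk = ceil(remaining / workers)) is an assumption chosen for illustration. All names (`SharedLoopState`, `claim_chunk`, `schedule`) are hypothetical.

```python
import math
import threading


class SharedLoopState:
    """Hypothetical stand-in for a shared window holding the scheduling state."""

    def __init__(self, total_iters):
        self.total_iters = total_iters
        self.next_iter = 0              # first iteration not yet scheduled
        self.lock = threading.Lock()    # mimics an exclusive lock on the window


def claim_chunk(state, num_workers):
    """Let the calling worker compute and claim its own chunk.

    Returns a half-open range (start, end), or None when no work remains.
    The size rule is guided self-scheduling, assumed here for illustration.
    """
    with state.lock:
        remaining = state.total_iters - state.next_iter
        if remaining <= 0:
            return None
        size = math.ceil(remaining / num_workers)
        start = state.next_iter
        state.next_iter = start + size
    return start, start + size


def worker(state, num_workers, worker_id, claimed):
    # No master is involved: each worker loops until the shared state is exhausted.
    while (chunk := claim_chunk(state, num_workers)) is not None:
        claimed.append((worker_id, chunk[0], chunk[1]))


def schedule(total_iters, num_workers):
    """Run the workers and return the claimed chunks sorted by start index."""
    state = SharedLoopState(total_iters)
    claimed = []
    threads = [
        threading.Thread(target=worker, args=(state, num_workers, w, claimed))
        for w in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(claimed, key=lambda c: c[1])
```

Calling `schedule(100, 4)` yields contiguous, non-overlapping chunks that cover iterations 0 to 99, with sizes that shrink as the remaining work decreases. Which worker claims which chunk varies between runs, but the chunk boundaries do not, because they depend only on the order of updates to the shared state.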