2012 IEEE 26th International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2012.41
Holistic Debugging of MPI Derived Datatypes

Abstract: The Message Passing Interface (MPI) specifies an API that allows programmers to create efficient and scalable parallel applications. The standard defines multiple constraints for each function parameter. For performance reasons, no MPI implementation checks all of these constraints at runtime. Derived datatypes are an important concept of MPI and allow users to describe an application's data structures for efficient and convenient communication. Using existing infrastructure, we present scalable algori…
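To make the derived-datatype concept from the abstract concrete, here is a minimal sketch (my own construction, not code from the paper): it builds a strided column datatype with MPI_Type_vector and uses it in a send/receive pair. Omitting the MPI_Type_commit call would be exactly the kind of parameter constraint that, for performance reasons, most MPI implementations do not verify at runtime, and that a runtime checker such as the one described here can detect.

/* Minimal sketch (not from the paper): a derived datatype describing
 * one column of a row-major 4x4 matrix. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double matrix[4][4];      /* row-major 4x4 matrix */
    MPI_Datatype column;

    /* One column: 4 blocks of 1 double, stride of 4 doubles */
    MPI_Type_vector(4, 1, 4, MPI_DOUBLE, &column);
    MPI_Type_commit(&column); /* forgetting this commit is a typical
                                 constraint violation such tools flag */

    if (rank == 0) {
        MPI_Send(&matrix[0][0], 1, column, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&matrix[0][0], 1, column, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&column);
    MPI_Finalize();
    return 0;
}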

Cited by 7 publications (4 citation statements). References 10 publications.
“…At this point we cannot provide reliable overhead measurements; we therefore refer to the published results for the analysis of MPI applications. The overhead of data race analysis is reported in [11] to be less than 80% for a worst-case benchmark with high-frequency collective communication; the overhead of deadlock detection is reported in [7] to be less than 34% in most cases, but may be higher when an error is detected. For full data race analysis in multi-threaded XMP applications, we expect the overhead to be in the range of 2-20x, as reported for Archer in [6].…”
Section: Discussion (mentioning)
confidence: 99%
“…While the simplest solution would be to report memory addresses, these give unsatisfactory detail on where an error resides within a communication buffer and its associated MPI datatype. We currently use a path expression approach [5] to pinpoint these error locations. An example of this path expression can be found in Section 1.5.…”
Section: Example 2: Viewing Datatype-Related Problems (mentioning)
confidence: 99%
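To illustrate what such a path expression buys over a raw memory address, here is a hypothetical sketch (my construction; the exact notation is defined in [5] and is not reproduced here) of a nested derived datatype whose individual leaf elements a datatype-aware checker can address:

/* Hypothetical sketch: a nested derived datatype whose leaves a
 * datatype-aware checker could address with a path expression.
 * The path-expression syntax in the comment below is illustrative,
 * not the exact notation from [5]. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Inner type: 3 doubles with a stride of 5 doubles */
    MPI_Datatype inner;
    MPI_Type_vector(3, 1, 5, MPI_DOUBLE, &inner);

    /* Outer type: 4 contiguous copies of the inner vector */
    MPI_Datatype outer;
    MPI_Type_contiguous(4, inner, &outer);
    MPI_Type_commit(&outer);

    /* If an error affected, say, the first double of the second inner
     * vector, a checker could report something like
     *     (CONTIGUOUS)[1](VECTOR)[0](MPI_DOUBLE)
     * instead of a bare address, pinpointing the faulty element
     * within the datatype tree. */

    MPI_Type_free(&inner);
    MPI_Type_free(&outer);
    MPI_Finalize();
    return 0;
}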
“…We develop the Marmot Umpire Scalable Tool (MUST), named after its predecessor tools Marmot [2] and Umpire [3], for this purpose. Recent advances in runtime deadlock detection [4] and datatype correctness checks [5] allow MUST to efficiently detect complex errors. However, detecting such errors is only half the solution to the overall problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…The first tool provides basic profiling information for execution phases, while the second detects lost messages of an MPI application at runtime. Both tools use the scalability features that GTI offers, while a third GTI-based tool exists that automatically detects usage errors of MPI datatypes [11]. We introduce the two example tools and their use of GTI in this section, while we present performance results in Section VII.…”
Section: Case Studies (mentioning)
confidence: 99%
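As an illustration of the error class the second GTI-based tool targets, the following minimal sketch (my construction, not from the cited work) contains a lost message: rank 0 posts two sends but rank 1 matches only one, so a runtime checker that tracks unmatched sends at MPI_Finalize can report the leftover.

/* Minimal sketch (not from the cited work): a lost message.
 * Rank 0 sends two messages, rank 1 posts only one receive,
 * so the send with tag 1 is never matched. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, value = 42;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Send(&value, 1, MPI_INT, 1, 1, MPI_COMM_WORLD); /* lost: no
                                                               matching receive */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize(); /* a checker tracking outstanding sends can flag
                       the unmatched message here */
    return 0;
}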