2012 IEEE 26th International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2012.41
Holistic Debugging of MPI Derived Datatypes

Abstract: The Message Passing Interface (MPI) specifies an API that allows programmers to create efficient and scalable parallel applications. The standard defines multiple constraints for each function parameter. For performance reasons, no MPI implementation checks all of these constraints at runtime. Derived datatypes are an important concept of MPI and allow users to describe an application's data structures for efficient and convenient communication. Using existing infrastructure, we present scalable algori…
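To make the derived-datatype concept from the abstract concrete, here is a minimal sketch (my own construction, not code from the paper): it builds a strided column datatype with MPI_Type_vector and uses it in a send/receive pair. Omitting the MPI_Type_commit call would be exactly the kind of parameter constraint that, for performance reasons, most MPI implementations do not verify at runtime, and that a runtime checker such as the one described here can detect.

/* Minimal sketch (not from the paper): a derived datatype describing
 * one column of a row-major 4x4 matrix. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double matrix[4][4];      /* row-major 4x4 matrix */
    MPI_Datatype column;

    /* One column: 4 blocks of 1 double, stride of 4 doubles */
    MPI_Type_vector(4, 1, 4, MPI_DOUBLE, &column);
    MPI_Type_commit(&column); /* forgetting this commit is a typical
                                 constraint violation such tools flag */

    if (rank == 0) {
        MPI_Send(&matrix[0][0], 1, column, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&matrix[0][0], 1, column, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&column);
    MPI_Finalize();
    return 0;
}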

Cited by 7 publications (4 citation statements). References 10 publications.
“…At this point we cannot provide reliable overhead measurements; we therefore refer to the published results for the analysis of MPI applications. The overhead of data race analysis is reported in [11] to be less than 80% for a worst-case benchmark with high-frequency collective communication; the overhead of deadlock detection is reported in [7] to be less than 34% in most cases, but may be higher when an error is detected. For full data race analysis in multi-threaded XMP applications, we expect the overhead to be in the range of 2-20x, as reported for Archer in [6].…”
Section: Discussion (mentioning)
confidence: 99%
“…While the simplest solution would be to report memory addresses, these give unsatisfactory detail on where an error resides within a communication buffer and its associated MPI datatype. We currently use a path expression approach [5] to pinpoint these error locations. An example of this path expression can be found in Section 1.5.…”
Section: Example 2: Viewing Datatype-Related Problems (mentioning)
confidence: 99%
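To illustrate what such a path expression buys over a raw memory address, here is a hypothetical sketch (my construction; the exact notation is defined in [5] and is not reproduced here) of a nested derived datatype whose individual leaf elements a datatype-aware checker can address:

/* Hypothetical sketch: a nested derived datatype whose leaves a
 * datatype-aware checker could address with a path expression.
 * The path-expression syntax in the comment below is illustrative,
 * not the exact notation from [5]. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Inner type: 3 doubles with a stride of 5 doubles */
    MPI_Datatype inner;
    MPI_Type_vector(3, 1, 5, MPI_DOUBLE, &inner);

    /* Outer type: 4 contiguous copies of the inner vector */
    MPI_Datatype outer;
    MPI_Type_contiguous(4, inner, &outer);
    MPI_Type_commit(&outer);

    /* If an error affected, say, the first double of the second inner
     * vector, a checker could report something like
     *     (CONTIGUOUS)[1](VECTOR)[0](MPI_DOUBLE)
     * instead of a bare address, pinpointing the faulty element
     * within the datatype tree. */

    MPI_Type_free(&inner);
    MPI_Type_free(&outer);
    MPI_Finalize();
    return 0;
}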
“…We develop the Marmot Umpire Scalable Tool (MUST), named after its predecessor tools Marmot [2] and Umpire [3], for this purpose. Recent advances in runtime deadlock detection [4] and datatype correctness checks [5] allow MUST to efficiently detect complex errors. However, detecting such errors is only half the solution to the overall problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…The first tool provides basic profiling information for execution phases, while the second detects lost messages of an MPI application at runtime. Both tools use the scalability features that GTI offers, while a third GTI-based tool exists that automatically detects usage errors of MPI datatypes [11]. We introduce the two example tools and their use of GTI in this section, while we present performance results in Section VII.…”
Section: Case Studies (mentioning)
confidence: 99%
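As an illustration of the error class the second GTI-based tool targets, the following minimal sketch (my construction, not from the cited work) contains a lost message: rank 0 posts two sends but rank 1 matches only one, so a runtime checker that tracks unmatched sends at MPI_Finalize can report the leftover.

/* Minimal sketch (not from the cited work): a lost message.
 * Rank 0 sends two messages, rank 1 posts only one receive,
 * so the send with tag 1 is never matched. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, value = 42;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Send(&value, 1, MPI_INT, 1, 1, MPI_COMM_WORLD); /* lost: no
                                                               matching receive */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize(); /* a checker tracking outstanding sends can flag
                       the unmatched message here */
    return 0;
}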