2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012)
DOI: 10.1109/ccgrid.2012.13

A Scalable Parallel Debugging Library with Pluggable Communication Protocols

Abstract: Parallel debugging faces challenges in both scalability and efficiency. A number of advanced methods have been invented to improve the efficiency of parallel debugging. As system scale increases, these methods rely heavily on a scalable communication protocol in order to be used in large-scale distributed environments. This paper describes a debugging middleware that provides fundamental debugging functions supporting multiple communication protocols. Its pluggable architecture allows users to select…

Citations: cited by 5 publications (4 citation statements)
References: 18 publications
“…Merging ranges takes advantage of the tree's topology. We utilize MRNet's back-end attachment mode [8] to initialize a communication tree, which guarantees the MPI ranks are ordered sequentially across the leaf nodes. Therefore, the MPI rank / thread id space owned by any sub-tree on the same level has no intersection, as shown in Figure 5.…”
Section: Scalable Collection Of Comparison Results
Citation type: mentioning
confidence: 99%
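
The range merging described in the statement above can be illustrated with a minimal sketch. It is not the citing authors' implementation and does not use MRNet; the function name `merge_ranges` and the four-leaf, two-level tree are assumptions made only for illustration. The sketch relies on the property the statement names: ranks are ordered sequentially across the leaves, so sibling sub-trees own disjoint rank ranges and adjacent ranges can be coalesced at each parent.

```python
from typing import List, Tuple

# Inclusive (first_rank, last_rank) pair; a hypothetical representation.
Range = Tuple[int, int]


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Coalesce inclusive integer ranges that touch or overlap."""
    merged: List[Range] = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            # Adjacent to (or overlapping) the previous range: extend it.
            prev_lo, prev_hi = merged[-1]
            merged[-1] = (prev_lo, max(prev_hi, hi))
        else:
            merged.append((lo, hi))
    return merged


# Assumed layout: four leaves owning ranks 0-3, 4-7, 8-11 and 12-15.
# Each leaf reports the ranks that share some comparison result.
leaf_reports = [[(0, 3)], [(4, 5)], [(8, 11)], [(12, 15)]]

# Each internal node merges the reports of its two children.
left = merge_ranges(leaf_reports[0] + leaf_reports[1])
right = merge_ranges(leaf_reports[2] + leaf_reports[3])

# The root merges the internal nodes' results.
root = merge_ranges(left + right)
```

Because sibling ranges never intersect, each parent forwards at most as many ranges as its children reported, and often fewer, which keeps the data arriving at the root compact.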
“…GTI provides a pluggable communication system that allows the use of different communication protocols. An extension of this approach could allow GTI to utilize existing TBON communication systems as in SPDL [17].…”
Section: Related Work
Citation type: mentioning
confidence: 98%
“…The closest near matches are works on debugging the programs running on supercomputers [9], the BigDebugger for debugging MapReduce applications for Big Data analysis [11], and two more recent developments: Cloud Debugger for Google's Cloud Platform for microservices [14] and Squash [17]. They are discussed below.…”
Section: Conclusion Of the Experiments
Citation type: mentioning
confidence: 99%
“…To enable the debugging of parallel programs running on supercomputer architectures, Jin et al. [9] developed a code library for launching a debug facility simultaneously on the nodes in a supercomputer from the front-end, and sending data collected back to the developer's workstation. The main function of the library code is the communication between the front-end and back-end in supercomputer systems.…”
Section: Conclusion Of the Experiments
Citation type: mentioning
confidence: 99%