Proceedings of the 2001 ACM/IEEE Conference on Supercomputing 2001
DOI: 10.1145/582034.582074

Scalable parallel application launch on Cplant™

Abstract: This paper describes the components of a runtime system for launching parallel applications and presents performance results for starting a job on more than a thousand nodes of a workstation cluster. This runtime system was developed at Sandia National Laboratories as part of the Computational Plant (Cplant™) project, which is deploying large-scale parallel computing clusters using commodity hardware and the Linux operating system. We have designed and implemented a flexible runtime system that allows for l…

Cited by 27 publications (23 citation statements)
References 8 publications
“…Brightwell et al [8] present the components of the runtime system for parallel application launch on Cplant project. They do not assume that the executable to be launched is available on a global file system.…”
Section: Related Work (mentioning)
confidence: 99%
“…MPD (Butler et al 2003) uses a group of daemons, arranged in a ring topology, to scalably start MPICH2 MPI processes. Yod (Brightwell and Fisk 2001) provides similar capabilities in the Cplant software stack.…”
Section: Related Work (mentioning)
confidence: 99%
“…[1], [2], [3], [4], [5], [6] For many important applications, the quantity of data to be processed exceeds the storage capacity or reasonable transfer times that can be achieved on conventional systems. For these large-data applications, a variety of techniques such as disk aggregation [7], [8], [9], [10], parallel network transfer [11], [12], peer-to-peer distribution [13], and multi-level storage management [2] are essential for constructing usable grid computing systems.…”
Section: Introduction (mentioning)
confidence: 99%