Reducing data distribution bottlenecks by employing data visualization filters

Franke, Ernest A.; Magee, M.

doi:10.1109/hpdc.1999.805305

Cited by 6 publications

(6 citation statements)

References 0 publications


“…There are a number of interesting uses for overlay networks. For example, overlay networks could (1) filter input data in an application-specific way, saving network bandwidth and compute-node memory [24,4,20,25]; (2) efficiently route data to compute nodes using data-dependent mapping functions, for example in applications with data-dependent decomposition of unstructured data [24]; and (3) process in-flight data sets to transform data into a format that matches the needs of the computation or a particular data distribution, for example to convert time-series data into frequency data for seismic imaging [31]. [Figure caption: (a) n-to-1 (shared-file) and n-to-n (file-per-process) write performance of Lustre compared to the n-to-n write performance of LWFS.]…”

Section: Overlay Network (mentioning)

confidence: 99%

Lightweight storage and overlay networks for fault tolerance.

Oldfield

2010


The next generation of capability-class, massively parallel processing (MPP) systems is expected to have hundreds of thousands to millions of processors. In such environments, it is critical to have fault-tolerance mechanisms, including checkpoint/restart, that scale with the size of applications and the percentage of the system on which the applications execute. For application-driven, periodic checkpoint operations, the state of the art does not provide a scalable solution. For example, on today's massive-scale systems that execute applications which consume most of the memory of the employed compute nodes, checkpoint operations generate I/O that consumes nearly 80% of the total I/O usage. Motivated by this observation, this project aims to improve I/O performance for application-directed checkpoints through the use of lightweight storage architectures and overlay networks. Lightweight storage provides direct access to underlying storage devices. Overlay networks provide caching and processing capabilities in the compute-node fabric. The combination has the potential to significantly reduce I/O overhead for large-scale applications. This report describes our combined efforts to model and understand overheads for application-directed checkpoints, as well as the implementation and performance analysis of a checkpoint service that uses available compute nodes as a network cache for checkpoint operations.

“…If the distribution of data across clients or across disks is dependent on the value of the data, moving that function to the data server can halve network traffic [22]. Processors near the data servers can filter data in an application-specific way, passing only the necessary data on to the clients, saving network bandwidth and client memory [10,22,23,24]. Processors near the data servers can exchange blocks without passing the data through clients, e.g., to rearrange blocks between disks during a copy or permutation operation.…”

Section: Remote Processing Of Application Code (mentioning)

confidence: 99%

“…For example, seismic data, used to extract images of the subsurface, requires a variety of processing steps to filter and transform data before computation [2]. Data-intensive applications also exist in climate modeling [3,4], physics and astronomy [5], biology and chemistry [6,7], visualization [8,9,10], and many others.…”

Section: Introduction (mentioning)

confidence: 99%

Armada: a parallel I/O framework for computational grids

Oldfield

Kotz

2002

Future Generation Computer Systems


“…Applications can distribute file data to compute nodes using a data-dependent mapping function, for example, in applications with a data-dependent decomposition of unstructured data [Kot95]. I/O nodes can filter data in an application-specific way, passing only the necessary data on to the compute node, saving network bandwidth and compute-node memory [Kot95,BP88,FM99,KCFS99]. I/O nodes can exchange blocks without passing the data through compute nodes, for example, to rearrange blocks between disks during a copy or permutation operation.…”

Section: The Need For Remote Application Code (mentioning)

confidence: 99%

Efficient I/O for computational grid applications.

Oldfield

Kotz


High-performance computing increasingly occurs on "computational grids" composed of heterogeneous and geographically distributed systems of computers, networks, and storage devices that collectively act as a single "virtual" computer. A key challenge in this environment is to provide efficient access to data distributed across remote data servers. This dissertation explores some of the issues associated with I/O for wide-area distributed computing and describes an I/O system, called Armada, with the following features: a framework to allow application and dataset providers to flexibly compose graphs of processing modules that describe the distribution, application interfaces, and processing required of the dataset before or after computation; an algorithm to restructure application graphs to increase parallelism and to improve network performance in a wide-area network; and a hierarchical graph-partitioning scheme that deploys components of the application graph in a way that is both beneficial to the application and sensitive to the administrative policies of the different administrative domains. Experiments show that applications using Armada perform well in both low- and high-bandwidth environments, and that our approach does an exceptional job of hiding the network latency inherent in grid computing.

Thanks to my parents for their love and encouragement over the years. I can only hope to be as much of an inspiration to my children as they were to me. Thanks to David Kotz for his guidance and friendship. He has a great instinct for knowing when to provide direction, when to provide encouragement, and when to get out of the way; all are qualities of a great advisor. Thanks to Tom Cormen for his car, two pairs of tennis shoes, his pit smoker, and for teaching me that nothing is more important than good barbecue. Thanks to the many graduate students that made my time at Dartmouth an enjoyable experience. Special thanks to Clint Hepner (my fishing companion), B.J. Premore (my hiking companion), and Senthil Periaswamy, a great friend and colleague. Thanks to the many people at Sandia National Laboratories that provided guidance (and funding) for my research. Special thanks to Jeff Nelson, Bill Camp, and David Womble. They are each great mentors and deserve much of the credit for my success. Thanks to Jay Lepreau, and the students and staff who run the Emulab at the University of Utah, for allowing us to use their facility to run our experiments. Most of all, thanks to my wife Susan for her loving support over the past six years.
