DASH: a Recipe for a Flash-based Data Intensive Supercomputer

He, Jiahua; Jagatheesan, Arun; Gupta, Sandeep; Bennett, Jeffrey; Snavely, Allan

doi:10.1109/sc.2010.16

Cited by 37 publications

(12 citation statements)

References 12 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Our measurements and other recent work [13] shows that software RAID provides better performance for SSD-based arrays than hardware controllers, because the processors on hardware RAID controllers become a bottleneck. Therefore, we use software RAID for this array.…”

Section: Raid-ssdmentioning

confidence: 55%

Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing

Caulfield

Coburn

Mollov

et al. 2010

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Self Cite

116

View full text Add to dashboard Cite

Abstract-Emerging storage technologies such as flash memories, phase-change memories, and spin-transfer torque memories are poised to close the enormous performance gap between disk-based storage and main memory. We evaluate several approaches to integrating these memories into computer systems by measuring their impact on IO-intensive, database, and memory-intensive applications. We explore several options for connecting solid-state storage to the host system and find that the memories deliver large gains in sequential and random access performance, but that different system organizations lead to different performance trade-offs. The memories provide substantial application-level gains as well, but overheads in the OS, file system, and application can limit performance. As a result, fully exploiting these memories' potential will require substantial changes to application and system software. Finally, paging to fast non-volatile memories is a viable option for some applications, providing an alternative to expensive, powerhungry DRAM for supporting scientific applications with large memory footprints.

show abstract

Section: Raid-ssdmentioning

confidence: 55%

Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing

Caulfield

Coburn

Mollov

et al. 2010

2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis

Self Cite

116

View full text Add to dashboard Cite

show abstract

“…• Dash [7], a cluster targeted for data-intensive computing at SDSC, consisting of 32 nodes each with dual quad-core 2.4 GHz Intel Nehalem processors and 48 GB of RAM per node. Half of Dash is configured as a traditional 16-node cluster but with a single 64 GB flash drive in each node.…”

Section: Resultsmentioning

confidence: 99%

Parallel high-resolution climate data analysis using swift

Woitaszek

Dennis

Sines

2011

Proceedings of the 2011 ACM International Workshop on Many Task Computing on Grids and Supercomputers

View full text Add to dashboard Cite

Advances in software parallelism and high-performance systems have resulted in an order of magnitude increase in the volume of output data produced by the Community Earth System Model (CESM). As the volume of data produced by CESM increases, the single-threaded script-based software packages traditionally used to post-process model output data have become a bottleneck in the analysis process. This paper presents a parallel version of the CESM atmosphere model data analysis workflow implemented using the Swift scripting language.Using the Swift implementation of the workflow, the time to analyze a 10-year atmosphere simulation on a typical cluster is reduced from 95 to 32 minutes on a single 8-core node and to 20 minutes on two nodes. The parallelized workflow is then used to evaluate several new data-intensive computational systems that feature RAM-based and flashbased storage. Even when constraining parallelism to limit the amount of file system space used by intermediate temporary data, our results show that the Swift-based implementation significantly reduces data analysis time.

show abstract

“…Moving to the field, several research laboratories are making efforts to improve the performance of their supercomputers with the SSD-based storage. For instance, San Diego Supercomputing Center deployed SSDs in their supercomputer, Gordon [12,34,42], to reduce the latency gap between the memory and disks, as has the Tokyo Institute of Technology for their supercomputer, TSUBAME [27].…”

Section: Related Workmentioning

confidence: 99%

Exploring the Future of Out-of-Core Computing with Compute-Local Non-Volatile Memory

Wilson

Choi

Shalf

et al. 2014

Scientific Programming

View full text Add to dashboard Cite

Abstract. Drawing parallels to the rise of general purpose graphical processing units (GPGPUs) as accelerators for specific high-performance computing (HPC) workloads, there is a rise in the use of non-volatile memory (NVM) as accelerators for I/O-intensive scientific applications. However, existing works have explored use of NVM within dedicated I/O nodes, which are distant from the compute nodes that actually need such acceleration. As NVM bandwidth begins to out-pace point-to-point network capacity, we argue for the need to break from the archetype of completely separated storage.Therefore, in this work we investigate co-location of NVM and compute by varying I/O interfaces, file systems, types of NVM, and both current and future SSD architectures, uncovering numerous bottlenecks implicit in these various levels in the I/O stack. We present novel hardware and software solutions, including the new Unified File System (UFS), to enable fuller utilization of the new compute-local NVM storage. Our experimental evaluation, which employs a real-world Out-of-Core (OoC) HPC application, demonstrates throughput increases in excess of an order of magnitude over current approaches.

show abstract

DASH: a Recipe for a Flash-based Data Intensive Supercomputer

Cited by 37 publications

References 12 publications

Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing

Understanding the Impact of Emerging Non-Volatile Memories on High-Performance, IO-Intensive Computing

Parallel high-resolution climate data analysis using swift

Exploring the Future of Out-of-Core Computing with Compute-Local Non-Volatile Memory

Contact Info

Product

Resources

About