Optimizing energy and performance for server-class file system workloads

Sehgal, Priya; Tarasov, Vasily; Zadok, Erez

doi:10.1145/1837915.1837918

Cited by 13 publications

(12 citation statements)

References 28 publications

“…Fei et al carried out a study of energy-saving/performance trade-offs in the context of Grid resource management [21], by applying and extending the theory of M/G/1 queues with vacations. Factors influencing energy and performance in data center storage systems (including deployed file systems) were studied in [17] while the disruptive impact of SSD technology on the energy and performance required to process database queries is considered in [16]. It was concluded that it is necessary to completely reassess traditional rules of thumb related to circumstances under which it is appropriate to use different kinds of table scans (i.e.…”

Section: Related Work (mentioning)

confidence: 99%

Energy--Performance Trade-Offs via the EP Queue

Harrison

Patel

Knottenbelt

2016

ACM Trans. Model. Perform. Eval. Comput. Syst.

We introduce the EP-queue, a significant generalization of the M^B/G/1 queue that has state-dependent service time probability distributions and incorporates power-up for first arrivals and power-down for idle periods. We derive exact results for the busy time and response time distributions. From these we derive power consumption metrics during non-idle periods and overall response time metrics, which together provide a single measure of the trade-off between energy and performance. We illustrate these trade-offs for some policies and show how numerical results can provide insights into system behavior. The EP-queue has application to storage systems, especially hard disks, and other data center components such as compute servers, networking and even hyper-converged infrastructure.

Introduction

Trading off power and response time in modern computing infrastructures

Modern software-defined data centers are typically organized in a hierarchical fashion, with multiple tiers of cache present within and across components ranging from processors to network and storage components. Advances in caching technologies (e.g. host-side flash caches) mean that much read traffic is being absorbed before it reaches the lower tiers [11,15]. Thus the workloads of lower tiers are increasingly dominated by writes which can be potentially buffered in higher tiers. Nevertheless, occasional read bursts still occur due to the reading of as-yet-uncached data. This burstiness of access creates the possibility of using components that can be turned off during idle periods and restarted when new work arrives, thus saving energy while delivering acceptable access latencies. Efficient usage of compute, network and storage resources is important in light of the fact that data centers are collectively storing about 35-50% more data per year [1] and consuming more than one percent of global electricity [9].
Much progress has been made on the server side with application consolidation via server virtualization and shared networked storage. However, further energy optimizations will be required as long as idle sub-components consume energy. In a system with components that shut down to save power, we need to understand the expected busy period durations to determine power savings. Similarly, we need to incorporate startup (POWER UP) and shutdown (POWER DOWN) times as random variables to model the impact on both the power and response time characteristics. As one would expect, the first arrival to an idle server will incur a significant delay but subsequent arrivals will likely fare much better. We would like to characterize the impact of this kind of response time behavior on host-side applications. It is in this context that we introduce and analyze the Energy-Performance (EP) queue, a significant generalization of the M^B/G/1 queue that has state-dependent service time probability distributions and incorporates power-up for first arrivals and power-down for idle periods. The analytical model developed compares two basic policies: the two-...

“…Measuring the effect of changes to individual system parameters (such as the bandwidth of the disk controllers or the number of disks in the system) within this consistent framework enables the effect of these parameters within a complex system to be better understood. However, often only a fixed set of components are used (see, for example, [13], [17]) and few studies have measured how individual parameter changes affect overall system metrics. Without rigorous benchmarking across a range of possibilities for a single system parameter only rough estimates of the change in system metrics can be made and these estimates are highly reliable on vendor specifications.…”

Section: Measurement Of Archival Storage (mentioning)

confidence: 99%

“…Studies in [6], [14], [18], [23] examined various optimal disk spin up/down time-out selections, data flushing, file-grouping and reliability to maximize the energy conservation while minimizing the impact on performance. [17] measured how different file systems (e.g., ext2/3, reserfs and xfs) have performance and power consumptions on two servers, finding that different file systems can be suited for different workloads. A system-wide energy consumption model in [13] measures individual components of the CPU, DRAM, HDD, fan and system board, to combine them for benchmarking and predicting overall energy consumption.…”

Section: Related Work (mentioning)

confidence: 99%

Measurement for Improving the Design of Commodity Archival Storage Tiers

Lee

O'Sullivan

Walker

2011

2011 Fourth IEEE International Conference on Utility and Cloud Computing

Archival data storage plays a critical role in data preservation as almost all current data will eventually be archived. In addition, the demands placed on archival storage tiers are growing because of large regularly-scheduled backups. Archival storage tiers usually consist of tape-based devices with a large storage capacity, but limited I/O performance for retrieving data, especially when multiple retrieval requests are made simultaneously. The cost of disk-based devices continues to decrease while the capacity of individual disks increases so that disk-based systems are a realistic option for enterprise archival storage tiers. Optimization approaches can design archival storage systems with the best mix of small, low-cost machines and larger, expensive machines, but only if various metrics of the candidate machines are well-understood. This paper investigates the measurement of different classes of enterprise servers when utilized by a distributed file system. Our study primarily concerns the possible use of these servers within a disk-based archival storage system and produces measurements suitable for immediate use in the optimization-driven design of archival storage. Observing patterns from these measurements also enables us to predict metrics for other enterprise servers and then incorporate these alternative servers in the design process. We combine our measurements and predictions with an optimization engine to discover an ideal building block for a 500TB archival storage system.

“…Most of the discussions about tail packing focused on the performance evaluation [16,18]; Sehgal et al discussed the influence of power consumption in a storage device [17], and Lu et al presented the impact of the lifetime of flash-based storage caused by the tailpacking solution [11]. Only a few prior works proposed solutions for increasing space utilization in file systems [4,14,15,21].…”

Section: Introduction (mentioning)

confidence: 99%

Dynamic tail packing to optimize space utilization of file systems in embedded computing systems

Hsu

Chen

Chang

et al. 2014

2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications

Embedded computing systems usually have limited computing power, RAM space, and storage capacity due to the consideration of their cost, energy consumption, and physical size. Some of them such as sensor nodes and embedded consumer electronics only have a small-sized flash memory as their storage with a (simple) file system to manage their data, which are usually of small sizes. However, the existing file systems usually have low space utilization on managing small files and the tail data of large files. In this work, we propose a dynamic tail packing scheme to optimize the space utilization of file systems by dynamically aggregating/packing the tail data of (small) files together. The proposed scheme was implemented in the file system of Linux operating systems to evaluate its capability. The results demonstrate that the proposed scheme could significantly improve the space utilization of existing file systems.
