Enabling Realistic Fine-Grain Voltage Scaling with Reconfigurable Power Distribution Networks

Godycki, Waclaw; Torng, Christopher; Bukreyev, Ivan; Apsel, Alyssa; Batten, Christopher

doi:10.1109/micro.2014.52

Cited by 43 publications

(13 citation statements)

References 45 publications


“…Energy efficiency is, in fact, a major issue across the whole computing spectrum, and modern systems have been exploring alternative heterogeneous processors [4][5][6][7][8][9][10] and Dynamic Voltage and Frequency Scaling (DVFS) [11][12][13][14] to trade-off performance and energy consumption.…”

Section: Introduction (mentioning)

confidence: 99%

Hipster: Hybrid Task Manager for Latency-Critical Cloud Workloads

Nishtala, Carpenter, Petrucci, et al. 2017

2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)


In 2013, U.S. data centers accounted for 2.2% of the country's total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important workloads are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to reduce power consumption due to increasing performance demands. This paper introduces Hipster, a technique that combines heuristics and reinforcement learning to manage latency-critical workloads. Hipster's goal is to improve resource efficiency in data centers while respecting the QoS of the latency-critical workloads. Hipster achieves its goal by exploring heterogeneous multicores and dynamic voltage and frequency scaling (DVFS). To improve data center utilization and make best usage of the available resources, Hipster can dynamically assign remaining cores to batch workloads without violating the QoS constraints for the latency-critical workloads. We perform experiments using a 64-bit ARM big.LITTLE platform, and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96%, and for Memcached from 92% to 99%, while reducing the energy consumption by up to 18%.


“…High DVFS transition latencies limit Rubik's gains somewhat. We hope that this and other recent work that requires fast DVFS [12,16] will motivate the adoption of low-latency DVFS interfaces. We implement Rubik as described in Sec.…”

Section: Real-system Evaluation (mentioning)

confidence: 91%

“…While off-chip regulators can take tens to hundreds of microseconds to adjust voltage [27,39], recent techniques based on on-chip voltage regulators [5,12,27,44] have sub-µs delays (e.g., 500 ns on Haswell [5]). Rubik leverages these fast voltage transition times, updating voltage/frequency at sub-millisecond granularity to counter short-term load variations (Sec.…”

Section: Dynamic Power Management (mentioning)

confidence: 99%

Rubik

Kasture, Bartolini, Beckmann, et al. 2015

Proceedings of the 48th International Symposium on Microarchitecture


Latency-critical workloads (e.g., web search), common in datacenters, require stable tail (e.g., 95th percentile) latencies of a few milliseconds. Servers running these workloads are kept lightly loaded to meet these stringent latency targets. This low utilization wastes billions of dollars in energy and equipment annually. Applying dynamic power management to latency-critical workloads is challenging. The fundamental issue is coping with their inherent short-term variability: requests arrive at unpredictable times and have variable lengths. Without knowledge of the future, prior techniques either adapt slowly and conservatively or rely on application-specific heuristics to maintain tail latency. We propose Rubik, a fine-grain DVFS scheme for latency-critical workloads. Rubik copes with variability through a novel, general, and efficient statistical performance model. This model allows Rubik to adjust frequencies at sub-millisecond granularity to save power while meeting the target tail latency. Rubik saves up to 66% of core power, widely outperforms prior techniques, and requires no application-specific tuning. Beyond saving core power, Rubik robustly adapts to sudden changes in load and system performance. We use this capability to design RubikColoc, a colocation scheme that uses Rubik to allow batch and latency-critical work to share hardware resources more aggressively than prior techniques. RubikColoc reduces datacenter power by up to 31% while using 41% fewer servers than a datacenter that segregates latency-critical and batch work, and achieves 100% core utilization.


“…The Per-Core DVFS technique comes at the expense of on-chip inductors and reduced regulator efficiency. Intel's TurboBoost [12] enables microsecond scale voltage transitions to allow, for example, [10] as a method to improve voltage transition times with the use of a configurable on-chip switch-cap based voltage regulator. Finally, Short Stop [8] and Booster [9] use dual-rail voltage systems to enable fine-grained boosting.…”

Section: Quick V/f Boosting: An Enabling Technology (mentioning)

confidence: 99%

“…Considering that the query latency for many OLDI services is in the range of milliseconds and microseconds [2,5], the emerging class of fine-grain (10s of nanoseconds) voltage boosting (i.e., quick boosting) techniques [6][7][8][9][10] has the potential to enable precise query-level boosting approaches. Given an energy budget, an intelligent quick boosting strategy could precisely pinpoint and boost queries that contribute to the tail as well as those whose latency is more likely to benefit from frequency/voltage boosting.…”

Section: Introduction (mentioning)

confidence: 99%

Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting

Hsu, Zhang, Laurenzano, et al. 2015

2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)


Reducing the long tail of the query latency distribution in modern warehouse scale computers is critical for improving performance and quality of service of workloads such as Web Search and Memcached. Traditional turbo boost increases a processor's voltage and frequency during a coarse-grain sliding window, boosting all queries that are processed during that window. However, the inability of such a technique to pinpoint tail queries for boosting limits its tail reduction benefit. In this work, we propose Adrenaline, an approach to leverage finer granularity, 10's of nanoseconds, voltage boosting to effectively rein in the tail latency with query-level precision. Two key insights underlie this work. First, emerging finer granularity voltage/frequency boosting is an enabling mechanism for intelligent allocation of the power budget to precisely boost only the queries that contribute to the tail latency; and second, per-query characteristics can be used to design indicators for proactively pinpointing these queries, triggering boosting accordingly. Based on these insights, Adrenaline effectively pinpoints and boosts queries that are likely to increase the tail distribution and can reap more benefit from the voltage/frequency boost. By evaluating under various workload configurations, we demonstrate the effectiveness of our methodology. We achieve up to a 2.50x tail latency improvement for Memcached and up to a 3.03x for Web Search over coarse-grained DVFS given a fixed boosting power budget. When optimizing for energy reduction, Adrenaline achieves up to a 1.81x improvement for Memcached and up to a 1.99x for Web Search over coarse-grained DVFS.

