2014
DOI: 10.1145/2678373.2665718

Towards energy proportionality for large-scale latency-critical workloads

Abstract: Reducing the energy footprint of warehouse-scale computer (WSC) systems is key to their affordability, yet difficult to achieve in practice. The lack of energy proportionality of typical WSC hardware and the fact that important workloads (such as search) require all servers to remain up regardless of traffic intensity renders existing power management techniques ineffective at reducing WSC energy use. We present PEGASUS, a feedback-based controller that significantly improves the energy proportionali…
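
The abstract only names PEGASUS as a feedback-based controller; below is a minimal sketch of the general idea of a latency-driven power-capping loop it alludes to. The helper functions, SLO value, cap range, and step sizes are hypothetical placeholders, not PEGASUS's actual interface or policy.

```python
# Minimal sketch of a latency-feedback power-capping loop (hypothetical,
# not PEGASUS's actual interface or policy).
import time

SLO_MS = 10.0                        # hypothetical 99th-percentile latency target
MIN_CAP_W, MAX_CAP_W = 40.0, 120.0   # hypothetical per-server power-cap range

def measure_tail_latency_ms():
    """Placeholder: return the currently measured tail (e.g., 99th-percentile) latency."""
    raise NotImplementedError

def set_power_cap_watts(cap):
    """Placeholder: apply a per-server power limit (e.g., via a RAPL-style interface)."""
    raise NotImplementedError

def control_loop(period_s=1.0):
    cap = MAX_CAP_W
    while True:
        latency = measure_tail_latency_ms()
        if latency > SLO_MS:
            cap = min(MAX_CAP_W, cap * 1.10)   # latency over target: give servers more power
        else:
            cap = max(MIN_CAP_W, cap * 0.98)   # latency slack: shave power to save energy
        set_power_cap_watts(cap)
        time.sleep(period_s)
```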

Cited by 105 publications (110 citation statements)
References 35 publications

“…These workloads are architected in a high-fanout, multitiered configuration, with root nodes receiving user requests and farming them out to leaf nodes for processing. Thousands of leaf nodes may collaborate to serve each user request [9,17,32], and the latency perceived by the user is determined by the few slowest nodes, since the root node must wait for results from most or all leaf nodes to produce the final response. Thus, to ensure acceptable end-to-end latencies, the tail latencies (e.g., 95th or 99th percentile latencies) of leaf nodes should be small (e.g., a few milliseconds) and uniform across nodes.…”
Section: A. Anatomy of Latency-Critical Applications
confidence: 99%
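
The fan-out argument above can be made concrete with a small back-of-the-envelope calculation (the leaf counts and the 99% figure below are illustrative, not taken from the quoted papers): even if each individual leaf is fast at its 99th percentile, a root waiting on many leaves almost always sees at least one tail-latency response.

```python
# Illustrative tail-at-scale arithmetic: probability that a root node waiting on
# N leaves receives *every* response below each leaf's 99th-percentile latency.
p_fast = 0.99                      # chance one leaf responds below its 99th percentile
for n_leaves in (1, 10, 100, 1000):
    p_all_fast = p_fast ** n_leaves
    print(f"{n_leaves:5d} leaves: P(no tail response) = {p_all_fast:.5f}")
# With 1000 leaves, P(no tail response) is about 0.00004, so the user-perceived
# latency is effectively governed by the leaves' tail, not their median.
```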
“…These techniques include new cluster managers that schedule and migrate applications across systems to reduce interference [18,32,36,54], fast dynamic voltage-frequency scaling (DVFS) techniques to improve power efficiency [25,29,32,48], hardware and software schemes to use low power idle states [37,39,53], and hardware resource partitioning schemes that allow batch workloads to run alongside latency-critical ones, improving utilization [29,30,33,57].…”
Section: A. Anatomy of Latency-Critical Applications
confidence: 99%
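
As context for why DVFS appears in the list above: dynamic CPU power scales roughly as C·V²·f, so a lower voltage/frequency point during periods of low traffic cuts power superlinearly. The operating points below are illustrative example values, not measurements from any of the cited systems.

```python
# Illustrative dynamic-power comparison of two DVFS operating points, using the
# standard approximation P_dynamic ≈ C * V^2 * f. All numbers are made-up examples.
C = 1.0e-9                           # effective switched capacitance (arbitrary units)

def dynamic_power(voltage_v, freq_hz):
    return C * voltage_v ** 2 * freq_hz

high = dynamic_power(1.20, 3.0e9)    # nominal operating point
low = dynamic_power(0.90, 1.8e9)     # scaled-down operating point
print(f"low point / high point dynamic power = {low / high:.2f}")
# Prints ~0.34: the scaled-down point draws roughly a third of the dynamic power,
# which is why latency-aware DVFS can save substantial energy under light load.
```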