2022
DOI: 10.1109/tsg.2021.3125275

Power Modeling for Effective Datacenter Planning and Compute Management

Abstract: Over the past decade, there has been a global growth in datacenter capacity, power consumption and the associated costs. Accurate mapping of datacenter resource usage (CPU, RAM, etc.) and hardware configurations (servers, accelerators, etc.) to its power consumption is necessary for efficient long-term infrastructure planning and real-time compute load management. This paper presents two types of statistical power models that relate CPU usage of Google's Power Distribution Units (PDUs, commonly referred to as …

Cited by 17 publications (27 citation statements)
References 22 publications
“…Our estimate of this upper bound, used for resource reservations, is greater than usage with high probability (close to 1). In prior work [20], we demonstrated that the power usage of a cluster power domain can be estimated within 5% error using a piecewise linear function of its CPU usage. As a result, any change in cluster-level CPU usage can be accurately mapped into a change in its power usage, which is critical for building CICS (see Subsection III-A for more details).…”
Section: B Google's Real-time Resource Management and Its Reliability... (mentioning)
confidence: 97%
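
The piecewise linear mapping from CPU usage to power described in the statement above can be illustrated with a minimal sketch. The breakpoints, the kW values, and the function names below are hypothetical and chosen only for illustration; they are not the fitted model from the cited paper, which would estimate its parameters from measured data.

```python
import numpy as np

# Hypothetical breakpoints (assumed for illustration, not taken from the paper):
# normalized CPU usage of a power domain and the power drawn at each breakpoint.
CPU_KNOTS = np.array([0.0, 0.3, 0.7, 1.0])
POWER_KNOTS_KW = np.array([400.0, 550.0, 700.0, 780.0])


def estimate_power_kw(cpu_usage):
    """Estimate power (kW) by linear interpolation between the breakpoints."""
    return np.interp(cpu_usage, CPU_KNOTS, POWER_KNOTS_KW)


def power_delta_kw(cpu_before, cpu_after):
    """Map a change in CPU usage to the corresponding change in power (kW)."""
    return estimate_power_kw(cpu_after) - estimate_power_kw(cpu_before)


if __name__ == "__main__":
    print(estimate_power_kw(0.5))    # power estimate at 50% CPU usage
    print(power_delta_kw(0.5, 0.7))  # extra power when usage rises from 50% to 70%
```

Because the function is linear within each segment, a change in CPU usage translates into a change in power at that segment's slope, which is the property the citing authors rely on when shaping load.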
“…One item that differentiates this new methodology from previous approaches is its inclusion of risk associated with application and infrastructure performance expectations. Special attention is given to (i) predicting the next day's flexible and inflexible compute usage, (ii) translating it to power consumption [20], and (iii) then optimizing in a risk-aware manner. To meet Google's infrastructure and application SLOs, there is monitoring, performance tracking, and a feedback loop that evaluates the recent application level impact and controls load shaping accordingly.…”
Section: A Related Research (mentioning)
confidence: 99%