2012
DOI: 10.1145/2382553.2382557
Quantifying the Mismatch between Emerging Scale-Out Applications and Modern Processors

Abstract: Emerging scale-out workloads require extensive amounts of computational resources. However, data centers using modern server hardware face physical constraints in space and power, limiting further expansion and calling for improvements in the computational density per server and in the per-operation energy. Continuing to improve the computational resources of the cloud while staying within physical constraints mandates optimizing server efficiency to ensure that server hardware closely matches the needs of scale-out workloads. […]

Cited by 28 publications (7 citation statements) · References 24 publications
“…Ferdman et al. have demonstrated the mismatch between cloud workloads and modern out-of-order cores [11,12]. Through their detailed analysis of scale-out workloads on modern cores, they discovered several important characteristics of these workloads: 1) scale-out workloads suffer from high instruction cache miss rates, and large instruction caches and prefetchers are inadequate; 2) instruction- and memory-level parallelism are low, thus leaving the advanced out-of-order core underutilized; 3) the working set sizes exceed the capacity of the on-chip caches; 4) bandwidth utilization of scale-out workloads is low.…”
Section: Characterizing Cloud Workloads (mentioning)
confidence: 99%
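The characteristics listed above are expressed as per-core counter-derived metrics. As a minimal sketch (not the cited authors' methodology), the snippet below shows how two of them, instruction-cache MPKI and IPC, are commonly derived from raw hardware-counter readings; the counter values used here are hypothetical placeholders.

```python
# Minimal sketch: deriving per-core metrics (L1-I MPKI and IPC) from raw
# hardware-counter readings. All counter values below are hypothetical.

def icache_mpki(icache_misses: int, instructions: int) -> float:
    """Instruction-cache misses per kilo-instruction (MPKI)."""
    return icache_misses / (instructions / 1000.0)

def ipc(instructions: int, cycles: int) -> float:
    """Retired instructions per cycle; low values indicate an underutilized core."""
    return instructions / cycles

if __name__ == "__main__":
    # Hypothetical counter readings for one core over a measurement interval.
    instructions = 2_000_000_000
    cycles       = 4_000_000_000
    l1i_misses   = 80_000_000

    print(f"IPC:       {ipc(instructions, cycles):.2f}")                 # 0.50
    print(f"L1-I MPKI: {icache_mpki(l1i_misses, instructions):.1f}")     # 40.0
```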
“…Replacing server memory with lower-bandwidth mobile DRAM results in between zero and 1.55x performance degradation for workloads such as SPEC-CPU, PARSEC, and SPEC-OMP [33]. However, most cloud workloads severely underutilize the available memory bandwidth [17,33], even during peak times. Ferdman et al. show that the per-core off-chip bandwidth utilization of map-reduce, media streaming, web front end, and web search is at most 25% of the available bandwidth.…”
Section: Cost of Non-Interleaved Address Mapping (mentioning)
confidence: 99%
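The "at most 25%" figure is a utilization ratio: measured off-chip traffic divided by the theoretical peak of the memory interface. A minimal sketch of that calculation follows, assuming a hypothetical 25.6 GB/s peak channel bandwidth and an illustrative traffic measurement (neither number comes from the cited papers).

```python
# Minimal sketch: off-chip bandwidth utilization as achieved traffic divided by
# the theoretical peak of the memory interface. Numbers are illustrative only.

def bandwidth_utilization(bytes_transferred: float, interval_s: float,
                          peak_gb_per_s: float) -> float:
    """Fraction of theoretical peak DRAM bandwidth actually consumed."""
    achieved_gb_per_s = bytes_transferred / interval_s / 1e9
    return achieved_gb_per_s / peak_gb_per_s

if __name__ == "__main__":
    # Hypothetical measurement: 6.4 GB of DRAM traffic over 1 s on a channel
    # with an assumed 25.6 GB/s theoretical peak.
    util = bandwidth_utilization(bytes_transferred=6.4e9, interval_s=1.0,
                                 peak_gb_per_s=25.6)
    print(f"Utilization: {util:.0%}")  # 25%, in line with the figure quoted above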
“…As server workloads operate on a large volume of data, they produce active memory working sets that dwarf the capacity-limited on-chip caches of server processors and reside in the off-chip memory; hence, these applications frequently miss the data in the on-chip caches and access the long-latency memory to retrieve it. Such frequent data misses preclude server processors from reaching their peak performance because cores are idle waiting for the data to arrive [1, 4, 12-24].…”
Section: Introduction (mentioning)
confidence: 99%
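To make the "cores are idle waiting for data" argument concrete, the sketch below gives a back-of-the-envelope estimate of the fraction of cycles stalled on off-chip memory. It is an illustrative model, not taken from the cited work, and assumes the simplest case in which misses are fully exposed (no overlap with useful work); the miss rate, base CPI, and DRAM latency are hypothetical.

```python
# Minimal sketch: back-of-the-envelope estimate of the fraction of core cycles
# stalled on off-chip memory when the working set misses in the on-chip caches.
# All inputs are hypothetical.

def stalled_cycle_fraction(mpki: float, base_cpi: float,
                           memory_latency_cycles: float) -> float:
    """Fraction of total cycles spent waiting on memory, assuming misses are
    exposed serially (no overlap with useful work)."""
    stall_cycles_per_ki = mpki * memory_latency_cycles   # per 1000 instructions
    busy_cycles_per_ki  = base_cpi * 1000.0
    return stall_cycles_per_ki / (busy_cycles_per_ki + stall_cycles_per_ki)

if __name__ == "__main__":
    # Hypothetical values: 20 last-level-cache misses per kilo-instruction,
    # a base CPI of 0.5, and a 200-cycle DRAM access latency.
    frac = stalled_cycle_fraction(mpki=20.0, base_cpi=0.5,
                                  memory_latency_cycles=200.0)
    print(f"Cycles stalled on memory: {frac:.0%}")  # ~89%
```

Even with these rough assumptions, the core spends the large majority of its time waiting on memory, which is the behavior the quoted passage describes.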