2004
DOI: 10.1145/1028176.1006708

Microarchitecture Optimizations for Exploiting Memory-Level Parallelism

Abstract: The performance of memory-bound commercial applications such as databases is limited by increasing memory latencies. In this paper, we show that exploiting memory-level parallelism (MLP) is an effective approach for improving the performance of these applications and that microarchitecture has a profound impact on achievable MLP. Using the epoch model of MLP, we reason how traditional microarchitecture features such as out-of-order issue and state-of-the-art microarchitecture techniques such as runahead executi…

Citations: cited by 119 publications (68 citation statements)
References: 29 publications
“…Average MLP is needed to consider to estimate the average penalty of each cache miss. We derive the average MLP definition in [2] as the average number of useful outstanding off-chip requests when there is at least one outstanding off-chip requests, denoted by MLP_avg. We have the following equation:…”
Section: Modeling Performance Impact Of Cache Sharing (mentioning)
confidence: 99%
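
The verbal definition quoted above can be illustrated with a small calculation. This is only a sketch of that definition, not the equation elided from the citing paper: it assumes a hypothetical per-cycle trace in which each entry is the number of useful off-chip requests outstanding in that cycle, and the names `average_mlp` and `trace` are illustrative.

```python
def average_mlp(outstanding_per_cycle):
    """Average number of useful outstanding off-chip requests,
    taken only over cycles with at least one such request outstanding.

    `outstanding_per_cycle` is an assumed input format: one
    non-negative integer per cycle.
    """
    # Keep only the cycles in which at least one request is outstanding.
    busy = [n for n in outstanding_per_cycle if n > 0]
    if not busy:
        # No off-chip activity at all; report 0.0 rather than divide by zero.
        return 0.0
    return sum(busy) / len(busy)


# Hypothetical trace: two bursts of off-chip misses separated by idle cycles.
trace = [0, 1, 2, 2, 1, 0, 0, 3, 3, 0]
mlp_avg = average_mlp(trace)
print(mlp_avg)
```

Idle cycles are excluded from the denominator, so the value reflects how many misses overlap while the memory system is busy, not how often it is busy.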
“…By taking MLP related cost of each cache miss into account, [12] modified the standard LRU replacement policy and higher performance guaranteed. [2] analyzed the microarchitecture impact on MLP and developed a detailed model to relating MLP to overall performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…The DBCP mechanism correlates data addresses to last-touch accesses, enabling DBCP to increase memory-level parallelism for dependent memory accesses. Modern out-of-order processors often stall, unable to overlap the latency of dependent L1D misses [5]. Correlating miss addresses with last-touch instruction traces enables the predictor to retrieve data-dependent blocks in parallel and enhances memory level parallelism for pointer-dependent traversals (e.g., linked lists or trees).…”
Section: Background: DBCP Prefetching (mentioning)
confidence: 99%
“…Other work targeted at understanding, exploiting or optimizing runahead execution have been presented by Sorin et al [16] and Chou et al [9].…”
Section: Related Work (mentioning)
confidence: 99%