2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 2021
DOI: 10.1109/ispass51385.2021.00034

Re-establishing Fetch-Directed Instruction Prefetching: An Industry Perspective

Cited by 22 publications (19 citation statements)
References 27 publications
“…Simulation parameters. We simulate and evaluate Thermometer using the ChampSim [5] simulator and adjust simulation parameters to resemble a recent state-of-the-art industry FDIP baseline [57,58], as listed in Table 1. We implement the optimal BTB replacement policy (Belady's algorithm [29,61]) and other existing policies including SRRIP [62], GHRP [20], and Hawkeye [60] to compare them with Thermometer.…”
Section: Experimental Methodology (mentioning)
confidence: 99%
“…Using these traces, we characterize BTB replacement challenges to design Thermometer, a novel profile-guided BTB replacement technique. We validate Thermometer's effectiveness on 13 data center applications and on CBP-5 [15] and IPC-1 [17] traces that prior work [20,24,57,58] evaluate their frontend optimizations.…”
Section: Experimental Methodology (mentioning)
confidence: 99%
“…The request for cache block containing these instructions is sent to the L1I, effectively hiding the latency for future instructions as multiple L1I requests are going in parallel. Recent work [22] also shows that L1I prefetchers like EIP [23] and FNL+MMA [24] do not provide significant performance improvement with a decoupled front-end and FDIP prefetcher.…”
Section: Motivation (mentioning)
confidence: 99%
“…A recent work [22] has shown that with a decoupled frontend, an FDIP prefetcher, and the ideal BTB, the ideal L1I does not provide a significant performance improvement. We also observe the same in Figure 1 where an ideal front-end provides 3% more improvement compared to the ideal BTB.…”
Section: Introduction (mentioning)
confidence: 99%