2008 41st IEEE/ACM International Symposium on Microarchitecture
DOI: 10.1109/micro.2008.4771779

Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs

Abstract: As the number of transistors integrated on a chip continues to increase, a growing challenge is accurately modeling per-… 8%, 9.5% and 17.8%, respectively.

Cited by 21 publications (13 citation statements)
References 31 publications
“…According to the formula 11, the FFT algorithm after adopting the mapping method doesn't improve a lot when the processing length is shorter than cache capacity. However if it is longer, the improvement is obvious [14] . To verify the effectiveness of this method, we will implement the radix-2…”
Section: Effective FFT Mapping Methods Based On Superscalar Processor (mentioning)
confidence: 99%
“…After data partitioning, the problem how to achieve data should be considered. In practical application, the data always be huge, thus it usually be stored in external storage space of DSP (usually SDRAM) [14] . Therefore we can use DMA of DSP processor to access data from SDRAM according to the partitioning method showed in formula 12, and remove it to on-chip memory.…”
Section: An Effective Mapping Methods For FFT Based On the TS201 (mentioning)
confidence: 99%
“…Karkhanis and Smith described a "first-order" performance model [11], which was later refined [6,2,5]. Instructions are (quickly) processed one by one to obtain certain statistics, like the CPI in the absence of miss events, the number of branch mispredictions, the number of non-overlapped long data cache misses, and so on.…”
Section: Structural Core Models (mentioning)
confidence: 99%
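The statistics listed in the statement above (CPI in the absence of miss events, branch misprediction count, non-overlapped long data cache miss count) are typically combined additively in a first-order model. The following is a minimal sketch of that idea, not the cited authors' implementation; the function name, parameter names, and the penalty values in the usage lines are hypothetical and chosen only for illustration.

```python
def first_order_cycles(n_instr, base_cpi, n_branch_mispred, branch_penalty,
                       n_long_misses, miss_latency):
    """Estimate total cycles as ideal cycles plus additive miss-event penalties.

    All parameters are illustrative assumptions:
      base_cpi         -- CPI in the absence of miss events
      n_branch_mispred -- number of branch mispredictions
      branch_penalty   -- assumed cycles lost per misprediction
      n_long_misses    -- number of non-overlapped long data cache misses
      miss_latency     -- assumed cycles lost per non-overlapped miss
    """
    ideal = n_instr * base_cpi
    branch_cycles = n_branch_mispred * branch_penalty
    memory_cycles = n_long_misses * miss_latency
    return ideal + branch_cycles + memory_cycles


# Usage with made-up numbers: 1000 instructions, base CPI of 0.5,
# 10 mispredictions at 15 cycles each, 4 long misses at 200 cycles each.
cycles = first_order_cycles(1000, 0.5, 10, 15, 4, 200)
cpi = cycles / 1000
```

Only non-overlapped misses are counted because misses that overlap with an earlier outstanding miss add little extra stall time; deciding which misses overlap is the part this sketch leaves out.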
“…A practical use of the BADCO methodology may use sampling to obtain a representative set of traces [25]. 2 We used SimpleScalar EIO tracing feature [1], which is included in the Zesto simulation package. Other known methods for reproducible simulations include for instance System-Inria the same sequence of instructions.…”
Section: Trace Generation (mentioning)
confidence: 99%
“…Karkhanis and Smith [104] use the interval model to explore the processor design space automatically and identify processor configurations that represent Pareto-optimal design points with respect to performance, energy and chip area for a particular application or set of applications. Chen and Aamodt [27] extend the interval model by proposing ways to include hardware prefetching and account for a limited number of miss status handling registers (MSHRs). Hong and Kim [84] present a first-order model for GPUs which shares some commonalities with the interval model described here.…”
Section: Follow-on Work (mentioning)
confidence: 99%