2007
DOI: 10.1109/tpds.2007.1091

A NUCA Substrate for Flexible CMP Cache Sharing

Abstract: We propose an organization for the on-chip memory system of a chip multiprocessor in which 16 processors share a 16-Mbyte pool of 64 level-2 (L2) cache banks. The L2 cache is organized as a nonuniform cache architecture (NUCA) array with a switched network embedded in it for high performance. We show that this organization can support a spectrum of degrees of sharing: unshared, in which each processor owns a private portion of the cache, thus reducing hit latency, and completely shared, in which every processo…
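The spectrum of sharing degrees described in the abstract can be illustrated with a small sketch. This is not the paper's mechanism: the function name `bank_partitions`, the `sharing_degree` parameter, and the contiguous assignment of bank ids to processor groups are all assumptions made for illustration, and the physical bank layout and network are not modeled. Only the counts (16 processors, 64 banks) come from the abstract.

```python
def bank_partitions(num_procs=16, num_banks=64, sharing_degree=1):
    """Map each processor to the L2 banks it may use (hypothetical helper).

    sharing_degree is the number of processors that share one partition:
    1 means unshared (private partitions), num_procs means completely shared.
    Bank ids are assigned to groups contiguously, which is an assumption.
    """
    if sharing_degree <= 0 or num_procs % sharing_degree != 0:
        raise ValueError("sharing_degree must evenly divide num_procs")
    num_groups = num_procs // sharing_degree
    if num_banks % num_groups != 0:
        raise ValueError("num_banks must split evenly across groups")
    banks_per_group = num_banks // num_groups

    partitions = {}
    for proc in range(num_procs):
        group = proc // sharing_degree  # processors in the same group share banks
        first = group * banks_per_group
        partitions[proc] = list(range(first, first + banks_per_group))
    return partitions


if __name__ == "__main__":
    for degree in (1, 4, 16):
        banks = bank_partitions(sharing_degree=degree)[0]
        print(f"sharing degree {degree}: processor 0 can use {len(banks)} banks")
```

Under these assumptions, a sharing degree of 1 gives each processor 4 of the 64 banks, while a degree of 16 gives every processor access to all 64. Intermediate degrees correspond to partitions shared by a subset of the processors.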

Cited by 155 publications (144 citation statements)
References 31 publications
“…This work, instead, focuses on the dynamic power consumption caused by bank accesses in a D-NUCA architecture, and aims to optimize the migration mechanism in order to make it more energy efficient without impacting on performance. Huh et al [8] proposed a CMP D-NUCA architecture and evidenced how two or more processors which share a cache line can generate migration patterns similar to the one presented in this work, but a study on the relevance of this phenomenon depending on the application and the architecture is not presented.…”
Section: Related Work (mentioning)
confidence: 57%
“…L2 maintains the same associativity but is 256 kB. Finally, LLC configuration corresponds to a shared SNUCA [29] with 4 banks and a total capacity of 8 MB. L2 is exclusive with L1 (i.e., acts as a victim cache of L1) and LLC is inclusive with private caches [28].…”
Section: Experimental Methodology (mentioning)
confidence: 99%
“…[10] proposes a new thinking to use multi-core to sequential applications that spills cores. [11] does the research on flexible CMP cache sharing. It offers a structure in which L2 caches may be shared by all processors, may be separated into private per-processor partitions, or may be partitioned into separate caches, each shared by a subset of the processors.…”
Section: Related Work (mentioning)
confidence: 99%