2010
DOI: 10.1109/mm.2010.82

SARC Coherence: Scaling Directory Cache Coherence in Performance and Power

Cited by 56 publications (48 citation statements)
References 11 publications
“…This will be driven heavily by the higher power consumption required with increased parallelism and the greater expense in time required to check the increased function unit count for cached copies of data. A number of existing studies provide a description of the challenges and costs associated with this exercise: Schuchhardt et al [14] and Kaxiras and Keramidas [8] provide quantitative evidence that cache coherency creates substantial additional on-chip traffic and suggest forms of hierarchical or dynamic directories to reduce traffic, but these approaches have limited scalability. Furthermore, Xu [20] finds that hierarchical caching doesn't improve the probability of finding a cache line locally as much as one would hope, a conclusion also supported by a completely independent study by Ros et al [12] that found conventional hardware coherence created too much long-distance communication (easily a problem for the scalability of future chips).…”
Section: Cache Locality/topology (mentioning)
confidence: 99%
“…Self-invalidation was recently used by Kaxiras and Keramidas in their "SARC Coherence" proposal [19]. They observe that with self-invalidation, writer prediction becomes straightforward to implement.…”
Section: Self-invalidation (mentioning)
confidence: 99%
“…In terms of performance and power, complex protocols are characterized by a large number of broadcasts and snoops. Here too, significant effort has been expended to reduce or filter coherence traffic [19,25,34] with the intent of making complex protocols more power- or performance-efficient. Verification of such protocols is difficult and in many cases incomplete [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…Relaxed consistency protocols: SC-for-DRF consistency protocols [1], [2], [6], [34], [35] rely on self-invalidation at synchronization points: Lebeck and Wood use self-invalidation to limit the number of cache blocks registered in the directory [6], SARC coherence [34] employs self-invalidation and implements a writer prediction to avoid the directory indirection upon downgrades. In DeNovo [1] a compiler inserts self-invalidating instructions based on source code annotations.…”
Section: Related Work (mentioning)
confidence: 99%
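
Several of the statements above refer to self-invalidation at synchronization points. As a rough illustration of that general idea only, the toy Python model below has each private cache drop its possibly stale clean copies on an acquire and write its dirty lines back on a release, so no invalidation messages are sent to sharers. This is a sketch under stated assumptions, not the SARC protocol: writer prediction and the directory are omitted, and `ToyCache` and its method names are hypothetical.

```python
class ToyCache:
    """Hypothetical private cache: self-invalidates on acquire, writes back on release."""

    def __init__(self, memory):
        self.memory = memory   # shared backing store (dict: address -> value)
        self.lines = {}        # locally cached copies
        self.dirty = set()     # addresses written since the last release

    def read(self, addr):
        if addr not in self.lines:            # miss: fetch from shared memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def release(self):
        # Make local writes visible before leaving the synchronized region.
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()

    def acquire(self):
        # Self-invalidate: discard clean copies that may be stale,
        # rather than waiting for invalidations from a writer.
        self.lines = {a: v for a, v in self.lines.items() if a in self.dirty}


memory = {0x10: 1}
producer, consumer = ToyCache(memory), ToyCache(memory)

consumer.read(0x10)              # consumer now holds a cached copy
producer.write(0x10, 2)
producer.release()               # the write reaches shared memory

stale = consumer.read(0x10)      # still served from the old cached copy
consumer.acquire()               # synchronization point: self-invalidate
fresh = consumer.read(0x10)      # refetched from shared memory
```

In this model the consumer only observes the producer's write after its own acquire, which mirrors the data-race-free assumption mentioned in the last statement: correctly synchronized programs never depend on the stale read.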