2008
DOI: 10.1109/hpca.2008.4658652

An OS-based alternative to full hardware coherence on tiled CMPs

Abstract: The interconnect mechanisms (shared bus or crossbar) used in current chip-multiprocessors (CMPs) are expected to become a bottleneck that prevents these architectures from scaling to a larger number of cores. Tiled CMPs offer better scalability by integrating relatively simple cores with a lightweight point-to-point interconnect. However, such interconnects make snooping impractical and, thus, require alternative solutions to cache coherence. This paper proposes a novel, cost-effective mechanism to support sha…

Cited by 40 publications
(40 citation statements)
References 36 publications
“…Under the remote-access framework of standard NUCA designs [7,9], all non-local memory accesses cause a request to be transmitted over the interconnect, the access to be performed in the remote core, and the data (for loads) or acknowledgement (for writes) to be sent back to the requesting core. When a core C executes a memory access for address A, it must first find the home core H for A (e.g., by consulting a mapping table or masking some address bits).…”
Section: A Remote Cache Access (mentioning)
confidence: 99%
“…A straightforward approach to removing directories while maintaining cache coherence is to disallow cache line replication across on-chip caches (even L1 caches) and use remote word-level access to load and store remotely cached data [7]: in this scheme, every access to an address cached on a remote core becomes a two-message round trip. Since only one copy is ever cached, however, coherence is trivially ensured.…”
Section: Introduction (mentioning)
confidence: 99%
“…NUCA architectures divide the address space among the cores in such a way that each address is assigned to a unique home core where the corresponding data can be cached [7], [5]. To read and write data cached in a remote core, the NUCA architectures proposed so far use a remote access mechanism where a request is sent to the home core and the resulting data (or acknowledgement) is sent back to the requesting core.…”
Section: Memory Access Framework (mentioning)
confidence: 99%
“…Under the remote-access framework of standard NUCA designs [7], [5], all non-local memory accesses cause a request to be transmitted over the interconnect network, the access to be performed in the remote core, and the data (for loads) or acknowledgement (for writes) to be sent back to the requesting core. When a core C executes a memory access for address A, it must first find the home core H for A (e.g., by consulting a mapping table or masking some address bits).…”
Section: Remote Cache Access (mentioning)
confidence: 99%
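
The remote-access scheme these statements describe can be illustrated with a small sketch. It is not taken from the cited paper or from any of the citing papers: the 64-byte line size, the 16-core count, the choice of address bits for the home mapping, and the names `home_core`, `Core`, and `Chip` are all assumptions made for illustration. Each address has exactly one home core, a non-local load or store costs one request message plus one reply (data or acknowledgement), and because only one copy of any address is ever cached, no coherence traffic is modeled.

```python
from dataclasses import dataclass, field

LINE_BITS = 6    # assumption: 64-byte cache lines
NUM_CORES = 16   # assumption: power-of-two core count


def home_core(addr: int) -> int:
    """Pick the home core by masking the address bits just above the
    line offset (one of the two lookup options the statements mention;
    the exact bits used here are an assumption)."""
    return (addr >> LINE_BITS) & (NUM_CORES - 1)


@dataclass
class Core:
    core_id: int
    # Holds only addresses whose home is this core, so there is never
    # a second cached copy to keep coherent.
    cache: dict = field(default_factory=dict)


class Chip:
    def __init__(self) -> None:
        self.cores = [Core(i) for i in range(NUM_CORES)]
        self.messages = 0  # interconnect messages sent so far

    def load(self, requester: int, addr: int) -> int:
        home = home_core(addr)
        if home != requester:
            self.messages += 2  # request to home core + data reply
        return self.cores[home].cache.get(addr, 0)

    def store(self, requester: int, addr: int, value: int) -> None:
        home = home_core(addr)
        if home != requester:
            self.messages += 2  # request to home core + acknowledgement
        self.cores[home].cache[addr] = value


chip = Chip()
chip.store(0, 0x1040, 7)       # remote store from core 0
local = chip.load(1, 0x1040)   # local load on the home core
remote = chip.load(3, 0x1040)  # remote load from core 3
```

In this sketch the home core performs every access on behalf of the requester, so the message counter grows by two for each non-local access and stays unchanged for local ones.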