2017
DOI: 10.1109/lca.2016.2577557

LazyPIM: An Efficient Cache Coherence Mechanism for Processing-in-Memory

Abstract: Processing-in-memory (PIM) architectures cannot use traditional approaches to cache coherence due to the high off-chip traffic consumed by coherence messages. We propose LazyPIM, a new hardware cache coherence mechanism designed specifically for PIM. LazyPIM uses a combination of speculative cache coherence and compressed coherence signatures to greatly reduce the overhead of keeping PIM coherent with the processor. We find that LazyPIM improves average performance across a range of PIM applications by 49.1% o…
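
To make the idea of a compressed coherence signature concrete, here is a minimal software sketch. It rests on assumptions the abstract does not state: the signature is modeled as a Bloom filter over cache-line addresses, the PIM kernel records the lines it reads while running speculatively, and a conflict with any processor-written line forces a rollback. The names `Signature` and `pim_kernel_must_rollback` and the parameters `bits` and `hashes` are hypothetical and are not taken from the paper.

```python
import hashlib


class Signature:
    """Fixed-size, Bloom-filter-style set of cache-line addresses (hypothetical model)."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits        # signature width in bits (assumed value)
        self.hashes = hashes    # number of hash functions (assumed value)
        self.word = 0           # bit vector stored as a Python int

    def _positions(self, line_addr):
        # Derive `hashes` bit positions from the address.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{line_addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, line_addr):
        for pos in self._positions(line_addr):
            self.word |= 1 << pos

    def may_contain(self, line_addr):
        # False positives are possible; false negatives are not.
        return all((self.word >> pos) & 1 for pos in self._positions(line_addr))


def pim_kernel_must_rollback(pim_read_sig, cpu_written_lines):
    """Conservative check: True if any CPU-written line may be in the PIM read set."""
    return any(pim_read_sig.may_contain(addr) for addr in cpu_written_lines)


if __name__ == "__main__":
    reads = Signature()
    for addr in (0x1000, 0x1040):
        reads.add(addr)
    print(pim_kernel_must_rollback(reads, [0x1040]))  # overlapping line, so a rollback is required
```

The point of the sketch is the trade-off: sending one fixed-size signature instead of a message per cache line keeps off-chip traffic low, at the cost of occasional false conflicts that trigger unnecessary rollbacks.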

Cited by 133 publications (132 citation statements)
References 48 publications
“…In a block group, the metadata block stores the sequence ID (SID), which is the unique number in the memory log area to represent a block group, and the metadata (BLK-1. Note that memory controllers are becoming increasingly more intelligent and complex to deal with various scheduling and performance management issues in multi-core and heterogeneous systems (e.g., [5], [6], [7], [8], [11], [12], [13], [14], [21], [25], [26], [27], [32], [33], [34], [35], [38], [39], [42], [45], [46], [49], [50], [51], [52], [53], [54], [61], [62], [64], [65], [66], [67], [68], [81], [84], [85], [86], [87], [88], [89], [97], [98], [108], [110], [112], [113],…”
Section: Eager Commit
Citation type: mentioning
confidence: 99%
“…Although a typical PIM consists of a processing unit (PU), a DRAM controller, and at least one more DRAM, recent PIM proposals have not questioned the necessity of using a cache for PIM [5,6,8,9,10,11]. Existing cache architectures for PIM may be classified under two large groups, one inside of PIM [5,10,11] and one outside [6,8,9]. A host processor is the CPU of the system, and a PIM management unit (PMU) receives and passes packets for the operation of PIM from the host processor to the PIM subsystem.…”
Section: Cache Management Policies For PIM
Citation type: mentioning
confidence: 99%
“…As mentioned in Section 5.1, for the evaluation, we generate workloads with different cache miss ratios and marked them with "-H" for the original workload, "-M" for a medium miss ratio setting, and "-L" for a low miss ratio setting. In the evaluation, we compare CAIRO with two naïve methods: (i) disabling offloading, denoted as the "no-offloading" method; and (ii) offloading all eligible candidates, denoted as the "all-offloading" method. We observe that while the "all-offloading" decision is beneficial for high miss ratio settings, it degrades performance for low miss ratio settings.…”
Section: Evaluation Of CPU Workloads
Citation type: mentioning
confidence: 99%