2020
DOI: 10.1109/TC.2020.2972528

BLADE: An in-Cache Computing Architecture for Edge Devices

Abstract: Area- and power-constrained edge devices are increasingly utilized to perform compute-intensive workloads, necessitating increasingly area- and power-efficient accelerators. In this context, in-SRAM computing performs hundreds of parallel operations on spatially local data common in many emerging workloads, while reducing the power consumption due to data movement. However, in-SRAM computing faces many challenges, including integration into the existing architecture, arithmetic operation support, data corruption at …

Cited by 61 publications (36 citation statements)
References 50 publications
“…PrIM is open-source and publicly available at [168]. Unlike these prior works, DAMOV is applicable to and can be used to study PIM architectures other than processing-in/near-DRAM, including processing-in/near-cache [68, 93-95, 169-171], processing-in/near-storage [40, 172-181], and processing-in/near emerging NVMs [81, 82, 90, 91, 100, 182, 183]. This is possible since DAMOV's methodology and benchmarks are mainly concerned with broadly characterizing data movement bottlenecks in an application, independent of the underlying PIM architecture.…”
Section: Discussion
confidence: 99%
“…These signals can be further combined via a NOR gate to achieve an XOR operation. Finally, further processing allows complex operations such as addition and multiplication to be performed [55, 56]. The operation results are then written back to the cache.…”
Section: BLADE - In-Cache Computing
confidence: 99%
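To make the bitline logic in this excerpt concrete, below is a minimal behavioural sketch in C. It assumes the two-wordline activation scheme common to in-SRAM computing designs: sensing two rows simultaneously yields AND on the bitline and NOR on the complementary bitline, and NOR-combining those two signals produces XOR. All function names are illustrative and not taken from the BLADE paper.

#include <stdint.h>
#include <stdio.h>

/* One 32-bit word models 32 bitline columns operating in parallel. */
static uint32_t bitline_and(uint32_t a, uint32_t b) { return a & b; }    /* sensed on BL  */
static uint32_t bitline_nor(uint32_t a, uint32_t b) { return ~(a | b); } /* sensed on BLB */

/* NOR of the two sensed signals: NOR(A AND B, A NOR B) = A XOR B,
 * since (A AND B) | (A NOR B) is 1 exactly where the bits agree. */
static uint32_t bitline_xor(uint32_t a, uint32_t b) {
    return ~(bitline_and(a, b) | bitline_nor(a, b));
}

/* Addition built on the same primitives: XOR gives the partial sum,
 * AND gives the carry; iterating until the carry dies out mirrors the
 * multi-cycle operation of an in-cache adder. */
static uint32_t bitline_add(uint32_t a, uint32_t b) {
    while (b != 0) {
        uint32_t carry = bitline_and(a, b) << 1;
        a = bitline_xor(a, b);
        b = carry;
    }
    return a;
}

int main(void) {
    uint32_t a = 0xCAFE, b = 0x1234;
    printf("xor=%08X add=%08X\n", bitline_xor(a, b), bitline_add(a, b));
    return 0;
}

Each loop iteration of bitline_add corresponds roughly to one round of row activations and write-backs in the hardware; real designs pipeline or parallelize this to reach the hundreds of concurrent operations mentioned in the abstract.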
“…Many approaches to Logic-in-Memory can be found in the literature; however, two main approaches can be distinguished. The first can be classified as Near-Memory Computing (NMC) [2-18], since the memory inner array is not modified and logic circuits are added at the periphery of the array; the second can be denoted as Logic-in-Memory (LiM) [19-28], since the memory cell itself is modified by adding logic circuits to it.…”
Section: Introduction
confidence: 99%
“…In an NMC architecture, logic and arithmetic circuits are added at the periphery of the memory array, in some cases exploiting 3D structures; the distance between computational and memory circuits is therefore shortened, resulting in power savings and latency reduction for the data exchange between them. For instance: in [3], logic and arithmetic circuits are added at the bottom of an SRAM (Static Random Access Memory) array, where data are transferred from different memory blocks, processed, and then written back to the array; in [2], a DRAM (Dynamic Random Access Memory) is modified to perform bitwise logic operations on the bitlines, and the sense amplifiers are configured as programmable logic gates. Near-Memory Computing maximises memory density with minimal modifications to the memory array itself, which is the most critical part of memory design; this results in a limited performance improvement with respect to computing systems based on conventional memories.…”
Section: Introduction
confidence: 99%
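The row-wise flow this excerpt describes (read rows from the array, compute at the periphery, write the result back) can be sketched behaviourally as below. This is a hypothetical model, not an interface from any of the cited designs; ROW_WORDS, nmc_row_op, and the three-operation set are assumptions chosen for illustration.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define ROW_WORDS 16 /* hypothetical row width: 16 x 32-bit words */

typedef enum { OP_AND, OP_OR, OP_XOR } nmc_op_t;

/* Peripheral compute unit: elaborates whole rows next to the array,
 * so only row addresses and an opcode, not the data itself, cross
 * the interconnect between the host and the memory. */
static void nmc_row_op(uint32_t *array, size_t dst, size_t src1,
                       size_t src2, nmc_op_t op) {
    uint32_t *d = array + dst * ROW_WORDS;
    const uint32_t *a = array + src1 * ROW_WORDS;
    const uint32_t *b = array + src2 * ROW_WORDS;
    for (size_t i = 0; i < ROW_WORDS; i++) {
        switch (op) {
        case OP_AND: d[i] = a[i] & b[i]; break;
        case OP_OR:  d[i] = a[i] | b[i]; break;
        case OP_XOR: d[i] = a[i] ^ b[i]; break;
        }
    }
}

int main(void) {
    /* Rows 0 and 1 hold operands; row 2 receives the result. */
    uint32_t mem[3 * ROW_WORDS] = {0};
    mem[0]         = 0xF0F0F0F0; /* row 0, word 0 */
    mem[ROW_WORDS] = 0x0FF00FF0; /* row 1, word 0 */
    nmc_row_op(mem, 2, 0, 1, OP_XOR);
    printf("row2[0] = %08X\n", mem[2 * ROW_WORDS]);
    return 0;
}

The key point the model captures is the data-movement saving: the host issues one row-granularity command instead of streaming ROW_WORDS words in each direction, which is the power and latency benefit the excerpt attributes to NMC.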