2020
DOI: 10.1109/lca.2020.3011643
pPIM: A Programmable Processor-in-Memory Architecture With Precision-Scaling for Deep Learning

Cited by 29 publications (9 citation statements) · References 9 publications
“…Lookup tables are a widely used method to improve runtime by replacing computation with memory lookups. Many works (Deng et al., 2019; Sutradhar et al., 2020; Ferreira et al., 2021) attempt to accelerate deep neural networks with lookup tables by memoizing vector multiplication results. However, due to the huge lookup table size (GB+) required to memoize all possible results of a vector-vector multiplication, all of them are DRAM-based in-memory accelerators, and hence not software solutions.…”
Section: Lookup Table Based Vector Multiplication Acceleration
confidence: 99%
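As an illustrative sketch (my own, not from any of the cited works), the lookup-table idea and the table-size blow-up the excerpt describes can be seen by comparing a memoized scalar multiply against a memoized vector-vector product:

```python
# Sketch: lookup-table multiplication (illustrative; not from the cited works).
# A scalar 8-bit x 8-bit multiply needs only 256 * 256 = 65,536 entries,
# so every product can be served from a small precomputed table.
SCALAR_LUT = {(a, b): a * b for a in range(256) for b in range(256)}

def lut_mul(a: int, b: int) -> int:
    """Replace computation with a memory lookup."""
    return SCALAR_LUT[(a, b)]

def table_entries(n: int, bits: int = 8) -> int:
    """Entries needed to memoize ALL results of an n-element vector-vector
    product over `bits`-bit operands: one entry per possible input pair."""
    return (2 ** bits) ** (2 * n)
```

Already at n = 4 the direct table holds 256**8 = 2**64 entries, which is why such tables end up in DRAM-based in-memory accelerators rather than in software.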
“…Besides bitwise operations, DRAM PIM has been shown to significantly improve neural network computation inside memory. For example, by performing operations commonly found in convolutional neural networks, such as the multiply-and-accumulate operation, in memory, DRAM PIM can achieve significant speedup over conventional architectures [45]. Other works support arithmetic operations such as addition and multiplication [1,23,43].…”
Section: PIM Using DRAM
confidence: 99%
“…Besides bitwise operations, DRAM PIM has been shown to significantly improve neural network computation inside memory. For example, by performing operations commonly found in convolutional neural networks, such as the multiply-and-accumulate operation, in memory, DRAM PIM can achieve significant speedup over conventional architectures [45]. In order to extract even more performance improvements, [12] places single instruction, multiple data (SIMD) PEs adjacent to the sense amplifiers, at the cost of higher area and power per bit of memory.…”
Section: PIM Using DRAM
confidence: 99%
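As a toy model (my own illustration, not taken from [45] or [12]), the multiply-and-accumulate pattern these DRAM PIM designs push into memory is simply an elementwise product reduced into an accumulator:

```python
# Toy model of the multiply-and-accumulate (MAC) operation, the core
# convolutional-network primitive that PIM DRAM proposals execute in memory.
def mac_row(weights, activations, acc=0):
    # Each weight/activation pair is multiplied where it is stored; the
    # partial products are summed, e.g. by SIMD PEs near the sense amps.
    for w, x in zip(weights, activations):
        acc += w * x
    return acc
```

In a real PIM design the loop body runs in parallel across an entire memory row rather than sequentially as here.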
“…As a result, PuM architectures can provide high compute throughput by performing operations in a bulk parallel manner, often at the granularity of memory rows. Prior PuM works [70,72,74,75,79,82,84,96,97] propose mechanisms for the execution of bulk bitwise operations (e.g., bitwise MAJority, AND, OR, NOT) [72, 74, 78, 80, 82-85, 87, 91, 98] and bulk arithmetic operations [70,75,79,96,97]. However, these proposals have two important limitations: 1) the execution of some complex operations (e.g., multiplication, division) incurs high latency and energy consumption [75], and 2) other complex operations (e.g., exponentiation, trigonometric functions) are not even supported.…”
Section: Introduction
confidence: 99%
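As a minimal sketch of why the bulk MAJority and NOT primitives mentioned in the excerpt suffice to build richer logic (my own illustration, not from the cited proposals): fixing one input of a 3-input majority to a constant yields AND or OR, and from those plus NOT a full adder follows.

```python
# Sketch: composing logic from 3-input majority (MAJ) and NOT, the
# primitives many PuM proposals execute in bulk on memory rows.
def maj(a: int, b: int, c: int) -> int:
    return 1 if a + b + c >= 2 else 0

def and_(a, b): return maj(a, b, 0)   # MAJ with a constant 0 row acts as AND
def or_(a, b):  return maj(a, b, 1)   # MAJ with a constant 1 row acts as OR
def not_(a):    return 1 - a

def xor(a, b):
    # XOR from MAJ/NOT: (a OR b) AND NOT(a AND b)
    return and_(or_(a, b), not_(and_(a, b)))

def full_adder(a, b, cin):
    carry = maj(a, b, cin)    # majority gives the carry bit directly
    s = xor(xor(a, b), cin)   # sum bit from MAJ/NOT-built XORs
    return s, carry
```

Chaining such full adders bit-serially is exactly why bulk multi-bit arithmetic like multiplication incurs the high latency the excerpt notes.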