2018
DOI: 10.1145/3296957.3173171

In-Memory Data Parallel Processor

Abstract: Recent developments in Non-Volatile Memories (NVMs) have opened up a new horizon for in-memory computing. Despite the significant performance gain offered by computational NVMs, previous works have relied on manual mapping of specialized kernels to the memory arrays, making it infeasible to execute more general workloads. We combat this problem by proposing a programmable in-memory processor architecture and data-parallel programming framework. The efficiency of the proposed in-memory processor comes from two …

Cited by 44 publications (16 citation statements)
References: 37 publications
“…• Fixed Point Dot Product (FiPDP)¹³. FiPDP is a classical dot-product $S = \sum_{i=1}^{N} A_i \times B_i$, where two vectors are multiplied element-wise, and the result vector is summed.…”
Section: Analysis Of Real-life Examples Using the Bitlet Model (mentioning)
confidence: 99%
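
The dot product quoted above can be illustrated with a minimal Python sketch. It is not the citing paper's implementation or a model of any PIM hardware: the 8-bit fractional format (`FRAC_BITS`) and the helper names `to_fixed`, `from_fixed`, and `fixed_point_dot` are assumptions introduced only for illustration.

```python
FRAC_BITS = 8  # assumed fixed-point format: 8 fractional bits (illustrative choice)


def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert a real number to a scaled integer (hypothetical helper)."""
    return int(round(x * (1 << frac_bits)))


def from_fixed(q, frac_bits=FRAC_BITS):
    """Convert a scaled integer back to a float (hypothetical helper)."""
    return q / (1 << frac_bits)


def fixed_point_dot(a, b, frac_bits=FRAC_BITS):
    """Compute S = sum_{i=1}^{N} A_i * B_i on fixed-point integers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0
    for a_i, b_i in zip(a, b):
        # Element-wise multiply; each product carries 2 * frac_bits fractional bits.
        acc += a_i * b_i
    # Shift once after the summation to return to frac_bits fractional bits.
    return acc >> frac_bits


if __name__ == "__main__":
    a = [to_fixed(1.5), to_fixed(2.0)]
    b = [to_fixed(2.0), to_fixed(0.25)]
    print(from_fixed(fixed_point_dot(a, b)))  # 1.5*2.0 + 2.0*0.25 = 3.5
```

Rescaling once after the accumulation, rather than after every product, avoids discarding low-order bits N times; a hardware design with a bounded accumulator width might choose differently.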
“…Using a configuration of $XBs = 4096$ and $R = 1024$ increases the PIM Pure (and combined PIM+CPU) throughput to about 100 GOPS, which is higher than the CPU Pure throughput of 31 GOPS stated above. ¹³ https://en.wikipedia.org/wiki/Dot_product…”
Section: Analysis Of Real-life Examples Using the Bitlet Model (mentioning)
confidence: 99%
“…FlexFlow is another dataflow model dealing with parallel types mismatch between the computation and CNN workloads [31]. These works attempt to make the best advantages of computation parallelism, data reuse, and flexibility [5,16,34].…”
Section: Dataflow (mentioning)
confidence: 99%
“…Processing-in-Memory (PIM) is a promising paradigm for accelerating memory-bandwidth-bound workloads, which have low arithmetic intensity [34, 48–58]. The key idea of the PIM paradigm is to move computation close to (i.e., processing-near-memory) or even into the memory devices (i.e., processing-using-memory) where the data resides (i.e., caches [48, 59–65], DRAM [33, 34, 49–58], storage [109–117]), eliminating the need to move the data to the processor and resulting in higher performance and lower energy consumption. Stencil computations are a prime candidate for acceleration using the PIM paradigm.…”
Section: Introduction (mentioning)
confidence: 99%