2022 IEEE International Solid-State Circuits Conference (ISSCC) 2022
DOI: 10.1109/isscc42614.2022.9731694

184QPS/W 64Mb/mm² 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System

citations
Cited by 30 publications
(15 citation statements)
references
References 5 publications
“…Additionally, an optional FP32 (application demands high precision) general matrix-multiplication engine (GEMM) [55] and an optional vector processing unit (VPU) [64] can be added to the design. Although FPGA's FP32 TFlops is not competitive with GPU or even CPU, GEMM/VPU might be useful in latency-sensitive inference tasks with simpler model, in which case data movement from FPGA to local or remote GPU can be eliminated.…”
Section: Access Engine (mentioning)
confidence: 99%
“…We anticipate consumer use-cases to continue diversifying, making affordable-yet-flexible DRAM increasingly important. Ambitious initiatives such as DRAM-system codesign [87,117,118,241,242] and emerging, non-traditional DRAM architectures [119,198,241,326,327,357-362] will naturally engender transparency by tightening the relationship between DRAM manufacturers and system designers. Regardless of the underlying motivation, we believe that increased transparency of DRAM reliability characteristics will remain crucial to allowing system designers to make the best use of commodity DRAM chips by enabling them to customize DRAM chips for system-level goals.…”
Section: Alternative Futures (mentioning)
confidence: 99%
“…Many works from academia [2, 10-12, 15-23, 25, 31, 35-39, 48, 81-83, 85, 86, 90, 99, 104-112] and industry [34, 41-43, 50-54] have shown the benefits of PnM and PuM for a wide range of workloads from different domains. However, fully adopting PIM in commercial systems is still very challenging due to the lack of tools and system support for PIM architectures across the computer architecture stack [4], which includes: (i) workload characterization methodologies and benchmark suites targeting PIM architectures; (ii) frameworks that can facilitate the implementation of complex operations and algorithms using the underlying PIM primitives (e.g., simple PIM arithmetic operations [19], bulk bitwise Boolean in-DRAM operations [83,84,92]); (iii) compiler support and compiler optimizations targeting PIM architectures; (iv) operating system support for PIM-aware virtual memory, memory management, data allocation and mapping; and (v) efficient data coherence and consistency mechanisms.…”
Section: Motivation and Problem (mentioning)
confidence: 99%