2016
DOI: 10.1145/2832911

Simultaneous Multi-Layer Access

Abstract: 3D-stacked DRAM alleviates the limited memory bandwidth bottleneck that exists in modern systems by leveraging through-silicon vias (TSVs) to deliver higher external memory channel bandwidth. Today's systems, however, cannot fully utilize the higher bandwidth offered by TSVs, due to the limited internal bandwidth within each layer of the 3D-stacked DRAM. We identify that the bottleneck to enabling higher bandwidth in 3D-stacked DRAM is now the global bitline interface, the connection between the DRAM row buffe…

Cited by 105 publications (11 citation statements)
References 56 publications
“…This makes DRAM an increasingly significant system performance bottleneck today, especially for workloads with large footprints that are sensitive to DRAM access latency [12,77,111,112,. Therefore, there is significant opportunity for improving overall system performance by reducing the memory access latency [16,22,52,59,74,77,78,82,106,109,110,203,204]. Although conventional latency-hiding techniques (e.g., caching, prefetching, multithreading) can potentially help mitigate many of the performance concerns, these techniques (1) fundamentally do not change the latency of each memory access and (2) fail to work in many cases (e.g., irregular memory access patterns, random accesses, huge memory footprints).…”
Section: Slowdown Of Generational Improvements (mentioning)
confidence: 99%
“…Prior works show the applicability of different processing types in accelerating stencil computations: Near-Memory. PIMS [34] exploits the high-bandwidth provided by 3D-stacked memories (e.g., HMC [155], Hybrid Bandwidth Memory (HBM) [179,180]) to accelerate stencils. Casper, being a near-LLC accelerator, can be integrated with any commodity processor without the need of costly interfacing using through-silicon vias.…”
Section: Related Work (mentioning)
confidence: 99%
“…The novel 3D integration technique (Figure 5c) greatly promotes the fabrication of the IMC system. [75,76] IV. Architecture design and optimization: Reasonable architecture design and optimization are conductive to improving the efficiency of the whole computing system.…”
Section: Conclusion and Outlooks (mentioning)
confidence: 99%
“…The novel 3D integration technique (Figure 5c) greatly promotes the fabrication of the IMC system. [75,76] Architecture design and optimization: Reasonable architecture design and optimization are conductive to improving the efficiency of the whole computing system. Hence, it is necessary to design the system architecture according to the characteristics and requirements of the application.…”
Section: Conclusion and Outlooks (mentioning)
confidence: 99%