Abstract: The convergence of communication, consumer applications, and computing within mobile systems pushes memory requirements in terms of size, bandwidth, and power consumption. The existing solution to the memory bottleneck is to increase the amount of on-chip memory. However, this solution is becoming prohibitively expensive, making 3D stacked DRAM an interesting alternative for mobile applications. In this paper, we examine the power/performance benefits of three different 3D stacked DRAM scenarios. Our high-level memory and Through Silicon Via (TSV) models have been calibrated on state-of-the-art industrial processes. We model the integration of a logic die with TSVs on top of both an existing DRAM and a DRAM with transceivers redesigned for 3D. Finally, we take advantage of the interconnect density enabled by 3D technology to analyze an ultra-wide memory interface. Experimental results confirm that TSV-based 3D integration is a promising technology option for future mobile applications, and that its full potential can be unleashed by jointly optimizing memory architecture and interface logic.
This paper starts with a brief introduction to UML 2.0 and to application-specific UML customizations via profiles. After a discussion of UML design tools with a focus on EDA support, we present a HW/SW co-design approach and demonstrate how HW architectures are described together with application SW in a single UML-based environment. Using a dedicated profile that provides support for SystemC in UML, and a SystemC wrapper for the SimIt instruction set simulator of a StrongARM, an executable model of the complete architecture is generated, which can be simulated by the SystemC kernel. The physical layer of an 802.11a system is used as an application example.
This paper presents a DRAM architecture that improves the DRAM performance/power trade-off, increasing the usability of DRAM in low-power chip designs that use 3D interconnect technology. A finer matrix subdivision, combined with buffering of the bitline signal at the local-block level, reduces both the energy per access and the access time. The resulting performance matches that of a typical low-power SRAM, while achieving a significant reduction in area and static power compared to such memories. The 128 kb memory architecture proposed here achieves an access time of 1.3 ns for a dynamic energy of less than 0.2 pJ per bit. A localized refresh mechanism yields a factor of 10 reduction in the static power consumption associated with the cell, and a factor of 2 reduction in area, when compared with an equivalent SRAM.
I. CONTEXT
As feature size shrinks, on-chip memory design is becoming more and more challenging. Reducing the typical dimensions and the supply voltage of SRAM memories degrades the cell stability [1]. The stability is degraded further by intra-die variations, which also increase the average power consumption. Several solutions have been investigated to mitigate this issue, from changing the cell topology [2] [3] [4] to modifying the peripheral architecture [5]. However, these solutions increase the memory area and thus compromise scaling. Embedded DRAM (eDRAM) has been proposed for large memory arrays. eDRAM clock speed and access time have been improved to match typical SRAM behavior [6]. However, using eDRAM requires integrating denser capacitors into the logic technology process, and thus needs costly additional process steps.
3D interconnect enables the use of heterogeneous technologies on the same chip. 3D vias are typically smaller and have less parasitic capacitance than off-chip connections [7]. In addition, they can be spread across the chip.
This reduces the routing energy and increases the number of available connections between two stacked dies. These advantages provide a better bandwidth-energy trade-off for routing between two stacked dies than between two packaged dies. A possible application of 3D interconnect is to separate the logic core of a system from the memory it requires. Such systems have already been studied in [8] [9], with stacks of an SRAM matrix on top of a logic layer. It is also possible to stack DRAM on top of a logic layer.
[Fig. 1. Global architecture: WL/BL subdivision. Diagram labels: Local_Address, Block_address, Global_SA, Mux, GBL, data_out, LWL receiver, Local SA, 32x32 cells, x16, GWL.]
This solution offers numerous other advantages compared to packaged DRAM, including a simpler input/output protocol, and it can solve termination and clock synchronisation issues by using shorter connections. This allows conventional DRAM to be used instead of SRAM or embedded DRAM for the largest memories in an SoC, bringing a higher density than SRAM without the need to integrate dedicated capacitors in the logic process, as eDRAM requires. However, traditional DRAM is outperformed by SRAM in several dom...
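To put the figures quoted in the abstract above (128 kb capacity, 1.3 ns access time, less than 0.2 pJ per bit) into perspective, the following is a minimal back-of-the-envelope sketch, not the authors' model. The 32-bit word width, the reading of 1 kb as 1024 bits, and the assumption of back-to-back accesses at one access per access time are all hypothetical and do not come from the source. Because the quoted energy is an upper bound, every result here is an upper bound as well.

```python
# Figures quoted in the abstract.
ACCESS_TIME_S = 1.3e-9       # access time: 1.3 ns
ENERGY_PER_BIT_J = 0.2e-12   # dynamic energy: less than 0.2 pJ per bit (upper bound)

# Assumption (not in the source): 1 kb is read as 1024 bits.
CAPACITY_BITS = 128 * 1024


def access_energy_j(word_bits):
    """Upper bound on the dynamic energy of one access of `word_bits` bits."""
    return word_bits * ENERGY_PER_BIT_J


def peak_dynamic_power_w(word_bits):
    """Upper bound on dynamic power.

    Assumption (not in the source): accesses are issued back to back,
    one per access time. A real cycle time may be longer than the
    access time, which would lower this figure.
    """
    return access_energy_j(word_bits) / ACCESS_TIME_S


def full_sweep_energy_j():
    """Upper bound on the dynamic energy to touch every bit once."""
    return CAPACITY_BITS * ENERGY_PER_BIT_J


if __name__ == "__main__":
    word_bits = 32  # hypothetical word width, chosen for illustration only
    print(f"energy per access : {access_energy_j(word_bits) * 1e12:.2f} pJ")
    print(f"peak dynamic power: {peak_dynamic_power_w(word_bits) * 1e3:.2f} mW")
    print(f"full-array sweep  : {full_sweep_energy_j() * 1e9:.2f} nJ")
```

Under these assumptions, a 32-bit access costs at most 6.4 pJ, which corresponds to roughly 4.9 mW of dynamic power if accesses are issued continuously. Changing `word_bits` scales both values linearly.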