A novel scalable and stackable nonvolatile memory technology suitable for building fast and dense memory devices is discussed. The memory cell is built by layering a storage element and a selector. The storage element is a Phase Change Memory (PCM) cell [1] and the selector is an Ovonic Threshold Switch (OTS) [2]. The vertically integrated memory cell of one PCM and one OTS (PCMS) is embedded in a true cross point array. Arrays are stacked on top of CMOS circuits for decoding, sensing and logic functions. A RESET speed of 9 ns and endurance of 10^6 cycles are achieved. One volt of dynamic range delineating SET vs. RESET is also demonstrated.
The intrinsic performance capability of a logic technology sometimes exceeds the frequency requirements of ultra-low-energy processors. For these applications, Vcc is typically reduced to improve energy efficiency while meeting the relaxed performance targets. However, the Vcc reduction and energy improvement achievable by conventional schemes are limited by (1) leakage power becoming a larger fraction of total power at low Vcc, (2) severe delay degradation below a certain Vcc, and (3) circuit failure at very low Vcc. We evaluate the effectiveness of a low-voltage swapped-body (LVSB) biasing technique in alleviating these limits. LVSB is implemented on test chips fabricated in 180nm and 130nm logic technologies, and on a TCP offload accelerator processor [1] in 90nm logic technology (Fig. 8.4.7). In the LVSB configuration (Figs. 8.4.1 and 8.4.2), the conventional body connections of NMOS and PMOS devices, used in designs with no body bias (NBB), are swapped. PMOS bodies are connected to ground instead of Vcc, and NMOS bodies are connected to Vcc instead of ground. As a result, all devices receive an amount of forward body bias equal to Vcc. Conventional body biasing schemes, in both the forward (FBB) and reverse (RBB) directions, apply a constant body-to-source voltage to all devices, independent of Vcc [2-4]. As a result, they incur some area overhead for bias generators and body bias grid routing. Area overhead for the LVSB technique is negligible, similar to NBB designs. LVSB can be used dynamically, where the body connections remain swapped during active mode and are reversed to the conventional NBB configuration during standby. The test chip (Fig. 8.4.1) contains different static circuit chains implemented in NBB and LVSB configurations, and is designed to enable delay and power measurement of each chain independently. The TCP processor chip (Fig. 8.4.2) contains on-chip PMOS bias generators. 
These on-chip bias generators are used for adaptive body biasing (ABB) to improve frequency binning in high performance applications [2]. Body bias to both PMOS and NMOS devices can also be applied from an off-chip power supply. A more detailed description of the processor core is reported in [1]. Power and frequency of three NBB and LVSB circuit chains are measured for Vcc values ranging from 0.25 to 1.8V and a temperature range of 25-75°C. In addition, the minimum Vcc (Vccmin) at which circuit functionality is preserved, even at very low frequencies, is measured. Maximum frequency of operation (Fmax) and total active power of the TCP processor chip are measured in both NBB and LVSB configurations across a range of Vcc values at 25°C and 75°C. Standby leakage powers are measured for NBB, as well as for static and dynamic LVSB. Frequencies of LVSB circuit chains in 180nm and 130nm technologies are higher than those of NBB chains below 0.8-0.9V (Fig. 8.4.3). For LVSB, the rate of frequency improvement with Vcc begins to saturate beyond 0.5-0.6V Vcc as the amount of (VT) r...
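To make the "forward body bias equal to Vcc" statement concrete, the following is a minimal sketch that uses the textbook body-effect expression for an NMOS device, VT = VT0 + γ(√(2φF + VSB) − √(2φF)), with VSB = −Vcc for the swapped-body connection. The model choice and every parameter value (`VT0`, `GAMMA`, `TWO_PHI_F`) are illustrative assumptions, not values from the abstract, and the expression ignores the body-junction diode current that appears at larger forward bias.

```python
import math

# Hypothetical device parameters, chosen only for illustration.
VT0 = 0.35        # zero-bias threshold voltage (V), assumed
GAMMA = 0.30      # body-effect coefficient (V^0.5), assumed
TWO_PHI_F = 0.80  # surface potential term 2*phi_F (V), assumed


def vt_with_body_bias(vt0, gamma, two_phi_f, v_sb):
    """Textbook body-effect model; v_sb < 0 means forward body bias."""
    if two_phi_f + v_sb < 0:
        raise ValueError("model not valid: forward bias exceeds 2*phi_F")
    return vt0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def lvsb_vt(vcc):
    """NMOS threshold with the body tied to Vcc and the source at ground.

    In the swapped-body connection V_SB = -Vcc, so the forward bias
    tracks the supply voltage.
    """
    return vt_with_body_bias(VT0, GAMMA, TWO_PHI_F, -vcc)


if __name__ == "__main__":
    for vcc in (0.0, 0.25, 0.4, 0.5):
        print(f"Vcc = {vcc:.2f} V -> VT = {lvsb_vt(vcc):.3f} V")
```

Under these assumptions the threshold voltage falls as Vcc rises, which is the qualitative behavior a supply-tracking forward bias would be expected to produce. A conventional FBB scheme would instead pass a fixed `v_sb` regardless of Vcc.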
Clock frequency of a multi-ported, 256×32b dynamic register file in a 100nm technology is improved by 50%, compared to the best dual-VT (DVT) design, using LBSF and SFN leakage-tolerant circuit techniques for LBL and GBL. Total transistor width of the full LBSF design is the smallest. High performance microprocessor execution cores require multi-ported register files (RF) with single-cycle read/write. High fan-in or wide dynamic circuits are typically used for local (LBL) and global (GBL) bitlines to meet the aggressive delay and area requirements. However, noise tolerance of wide domino gates degrades rapidly with technology scaling as transistor subthreshold leakage increases exponentially [1]. Although circuit robustness can be recovered by upsizing the keeper at the dynamic node and by reducing skew of the static stage, the resulting performance loss is too large. Conditional keeper schemes [2], pseudo-static local bitlines [3] and a fully static dual-VT register file [4] have been reported previously for improving performance and leakage tolerance of wide domino gates and register files. In this paper, we propose two techniques, Leakage Bypass with Stack Forcing (LBSF) and Source Follower NMOS (SFN), to improve speed and robustness of wide dynamic circuits. We evaluate the effectiveness of using LBSF and SFN in leakage-sensitive parts of a 256×32b multi-ported dynamic register file design in a 100nm dual-VT technology. Delay, energy, total transistor width and leakage tolerance are compared with a conventional dual-VT (DVT) design. Dynamic RF bitcells, driving the 16-wide LBL, and a domino mux, driving the 8-wide GBL, are the key leakage-sensitive and performance-critical circuits in the 256×32b 4-read, 4-write ported register file with single-ended read-select and bitline signaling (Fig. 1). The 1000µm long GBL in M4 spans the lengths of all 8 banks in the array, and thus presents a large interconnect load to the domino mux. 
Skewed static stages, which merge two LBLs per bank, drive the mux inputs. A 2Φ clocking scheme is used with full time borrowing at the boundary. An 8-bit address per read/write port is decoded in the previous cycle to generate the read/write select signal at the Φ1 edge. Read delay from the Φ1 edge to data at the output of the static inverter driven by GBL determines clock cycle time. Noise immunities of the wide dynamic circuits used for driving LBL and GBL degrade as VT is lowered because of excessive leakage in the bitline pull-down devices and a worsening trip point of the static stage. In DVT designs, leakage-sensitive devices are made high-VT to achieve a DC noise robustness of at least 10% of supply voltage (Vcc) with minimal performance impact (Fig. 1b). Hence, they cannot take advantage of the large drive currents available from low-VT devices in a dual-VT technology. In the LBSF scheme, the LBL pull-down device M0 (Fig. 2a) in the bitcell is replaced by two series-connected devices M1 and M2 (Fig. 2b) to force a 2-stack in the bitline leakage path through the keeper, regardless of the valu...
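The array organization described above can be checked with a few lines of arithmetic. This is a minimal sketch assuming that each 16-wide LBL serves 16 entries of one bit column and that each of the 8 banks holds the two LBLs merged by its static stage; that reading of the text, and the function and parameter names, are assumptions made for illustration.

```python
def register_file_organization(banks=8, lbls_per_bank=2,
                               cells_per_lbl=16, address_bits=8):
    """Derive entry counts from the stated bitline fan-ins.

    Defaults follow the figures quoted in the text: 8-wide GBL (one
    input per bank), two LBLs merged per bank, 16-wide LBL, and an
    8-bit address per port.
    """
    entries_per_bank = lbls_per_bank * cells_per_lbl
    entries = banks * entries_per_bank
    addressable = 2 ** address_bits
    return {
        "entries_per_bank": entries_per_bank,
        "entries": entries,
        "addressable": addressable,
        "consistent": entries == addressable,
    }


if __name__ == "__main__":
    print(register_file_organization())
```

With the default values the entry count matches the 256-entry depth and the 8-bit address, so the quoted fan-ins are mutually consistent under this interpretation.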