E. Fetzer scite author profile

Multi-threaded microprocessors require multiple sets of register files to process concurrent instruction streams. This significantly complicates the register-file (RF) design and exacerbates reliability problems. This paper describes the dual-threaded, 18-port (8-read, 10-write), 128-word × 82b floating-point register file (FRF), and the 22-port (12-read, 10-write), 128 × 65b integer register file (IRF) of the processor code-named Montecito [1]. A memory circuit is designed that consists of two storage cells, each bit accessible by two different threads. A charge-compensation technique is introduced to mitigate charge-sharing noise induced by thread-switch events. The dual-threaded implementation essentially doubles the number of memory cells and the RF becomes even more susceptible to errors caused by high-energy particles. A low-complexity parity-checking scheme is embedded into each register to provide soft error detection. The current design also integrates several power-saving features to achieve energy-efficiency and reliable operation.

To support dual-threaded execution, each register bitcell incorporates two identical storage cells (Fig. 20.5.1). A control signal WRITEH is driven high during write operations, reducing the contention between the two cross-coupled inverters. Four transmission gates determine the thread selection, where each storage cell is exclusively accessible by either thread. Switching threads can induce charge-sharing noise between the storage cells b0/b1 and the internal bitlines ida/idb. Consider the worst-case scenario where the two storage cells contain different voltages. Each internal bitline will have the same voltage as the storage cell it is connected to. A thread switch is performed by flipping the thread signal. Each storage cell is then connected to the internal bitline having a different voltage value. Charge is redistributed momentarily between the storage cells and the internal bitlines. This may flip the storage cells and cause logic failures. To prevent this problem, two charge-compensation pFETs p0/p1 are introduced. The signal writel, connected to the sources of p0/p1, remains high during thread switch, thereby compensating for possible charge loss at b0/b1 through p0/p1 (Fig. 20.5.2). This technique does not induce additional contention during write operations when writel switches to virtual ground. Moreover, this technique improves write timing. A high voltage at the storage node (b0/b1) will be pulled down slightly by writel at the beginning of a write operation, making it easier to write a "0". For writing a "1", the charge-compensation pFETs p0/p1 help boost b0/b1 to VDD quickly at the end of a write when writel goes high.

The register files are relatively large because of the dual-threaded implementation. This, compounded with the high-density layout, makes the RFs susceptible to soft errors. Therefore, the RF data arrays integrate parity-checking hardware (Fig. 20.5.3) to provide soft error detection. Parity generation logic is embedded locally inside each register. ...
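As a rough software illustration of what parity-based soft error detection does, the sketch below models a single register word with one parity bit. It is not the circuit described in the abstract: the use of a single even-parity bit per word, the 82-bit default width (borrowed from the FRF word size), and the function names `parity_bit`, `write_register`, and `check_register` are all assumptions made for illustration.

```python
# Minimal sketch of per-register parity protection (assumed: one even-parity
# bit per word; the actual hardware organization is not given in the abstract).

def parity_bit(word: int, width: int = 82) -> int:
    """Return the XOR of all data bits of a width-bit word (even parity)."""
    if word < 0 or word >> width:
        raise ValueError(f"word does not fit in {width} bits")
    return bin(word).count("1") & 1


def write_register(word: int, width: int = 82) -> tuple[int, int]:
    """Model a write: store the data together with its generated parity bit."""
    return word, parity_bit(word, width)


def check_register(stored_word: int, stored_parity: int, width: int = 82) -> bool:
    """Model a read: recompute parity and compare it with the stored bit."""
    return parity_bit(stored_word, width) == stored_parity


if __name__ == "__main__":
    data, parity = write_register(0b1011)

    # An undisturbed word passes the check.
    print(check_register(data, parity))

    # A single flipped bit (a modeled soft error) is detected.
    print(check_register(data ^ (1 << 5), parity))

    # Two flipped bits cancel out and escape a single parity bit.
    print(check_register(data ^ 0b11, parity))
```

A single parity bit detects any odd number of flipped bits but cannot correct them, and it misses an even number of flips, which is why the abstract describes the scheme as error detection rather than correction.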


E. Fetzer

Clock distribution on a dual-core, multi-threaded Itanium-family processor

A 32 nm, 3.1 Billion Transistor, 12 Wide Issue Itanium® Processor for Mission-Critical Servers

A fully bypassed six-issue integer datapath and register file on the Itanium-2 microprocessor

The Parity Protected, Multithreaded Register Files on the 90-nm Itanium Microprocessor

The multi-threaded, parity-protected 128-word register files on a dual-core Itanium-family processor
