To address both desktop and battery-powered markets with the same design, a register file (RF) is presented that generates its own internal timing and accurately tracks process, temperature, and power-supply variation from 0.7V to 1.2V. The 6-write, 10-read, 34-word x 64b RF is part of FR-V, a high-performance VLIW processor [1]. The proposed RF generates all internal timing from a single clock edge for a write followed by a read operation within one clock cycle. Although previous memories use bit-line replicas for sense-amplifier timing [2,3], the current design replicates the entire write and read timing path, eliminating the need for tuning self-timed signals and improving circuit reliability. The supply voltage, Vdd, can be statically or dynamically stepped down from 1.2V to 0.7V to reduce power dissipation. Additionally, a separate power supply, Vddr, is provided for the array to allow a low-leakage sleep mode in which the RF maintains its state with Vdd shut off. During low-voltage operation, Vddr is stepped down from 1.2V to 1.05V. Voltage conversion between Vdd at 0.7V and Vddr at 1.05V is implicit in the dynamic gates and incurs no static power loss.

To keep the RF small despite its large port count, single-rail bit lines are used for both write and read (Figure 25.4.1). For simplicity, only one write port and one read port are shown in Figure 25.4.1. M1, M2, and M3 are replicated for each write port, and M4 and M5 are replicated for each read port. The cell inverters are powered from Vddr. Write word lines are also powered from Vddr to enhance writes at low-voltage operation, since Vddr is higher than Vdd. Read word lines, as well as read and write bit lines, are powered from Vdd. Write operations are asymmetric: writing a "1" is faster than writing a "0". Since the RF supports a write-through capability, write operations complete only when node bitBf bar has settled.
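The write-then-read ordering described above can be illustrated with a small behavioral model. This is only a sketch of the visible semantics, not of the circuit: the class and method names are hypothetical, the port counts and array size are taken from the text, and the handling of two writes to the same address in one cycle (last one wins) is an assumption the text does not address.

```python
from dataclasses import dataclass, field

WORDS = 34        # from the text: 34 words
WIDTH = 64        # from the text: 64 bits per word
WRITE_PORTS = 6   # from the text
READ_PORTS = 10   # from the text
MASK = (1 << WIDTH) - 1


@dataclass
class RegisterFileModel:
    """Hypothetical behavioral model; names are illustrative only."""
    cells: list = field(default_factory=lambda: [0] * WORDS)

    def cycle(self, writes, reads):
        """Model one clock cycle.

        writes: list of (address, value) pairs, one per active write port.
        reads:  list of addresses, one per active read port.
        Returns the list of values seen by the read ports.
        """
        if len(writes) > WRITE_PORTS or len(reads) > READ_PORTS:
            raise ValueError("more accesses than available ports")
        for addr in [a for a, _ in writes] + list(reads):
            if not 0 <= addr < WORDS:
                raise ValueError(f"address {addr} out of range")
        # Writes are applied first (assumption: later entries win on a
        # same-address conflict, which the text does not specify).
        for addr, value in writes:
            self.cells[addr] = value & MASK
        # Reads in the same cycle then observe the freshly written data,
        # which is the write-through behavior.
        return [self.cells[addr] for addr in reads]
```

For instance, calling `cycle([(3, 0xDEADBEEF)], [3, 4])` on a fresh model returns the newly written value for address 3 and zero for the untouched address 4. The model ignores timing, voltage domains, and the half-word access widths.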
To compensate for the write asymmetry, I3 is driven by bit instead of bit bar, favoring the slower write-"0" operation. Read uses a 17x2 dynamic OR-AND structure (i.e., 17 cells per half bit line, connected to a static NAND) to conserve power, increase speed, and reduce bit-line leakage.

At low-voltage operation, Vdd is 0.7V and Vddr is 1.05V. RF addresses are decoded in two stages, implemented in dynamic domino logic (Figure 25.4.2). For writes, the predecode stage is powered from Vdd and precharged by pc_wr, and the decode/drive stage is powered from Vddr and precharged by pcdl_wr, a delayed version of pc_wr, which eliminates the footer nFET in the second stage. Both pc_wr and pcdl_wr are powered from Vddr to avoid static current in the delay logic and the second precharged gate. Voltage conversion from Vdd at predAd to Vddr at wwl occurs implicitly as the signal passes through the final decode dynamic gate. Figure 25.4.3 shows the RF control and data signal flow. Each write and read port has a 4b control input (wc[3:0] and rc[3:0], respectively) that enables the port and determines the access width (i.e., LS bits, MS bits, or both) and a 6b address (wa[...