Soraya Ghiasi, Tuyet Nguyen, Norman James, [...]; IBM, Austin, TX

Scaling has caused an increase in process variation and in the sensitivity of cycle time to workload and environmental conditions, making it difficult to predict the cycle time of microprocessors. Cycle time is determined by the required performance with an added timing margin determined by the acceptable yield. After manufacture, microprocessors are binned into performance categories to account for process variation, but because of the influence of the workload on cycle time, there is a danger of losing performance with overly conservative timing margins. A critical-path monitor (CPM) that measures critical-path delay and the effects of noise and localized VDD droops on timing is designed as part of the POWER6™ microprocessor. The CPM also measures across-chip process variation and [...] the microprocessor, many different timing paths may be critical. The CPM uses a small number of delay paths with different delay versus process, voltage, and temperature (DvPVT) curves to synthesize the critical paths. It is a time-to-digital converter that uses the system clock as the reference signal for conversion. The CPM, shown in Fig. 22.1.1, is composed of an edge-launching latch, delay-synthesis block, edge detector, data-analysis block, and con[...]

[...] The CPM captures rising and falling edge delay on alternating clock cycles. The core CPM is 90×36µm² and the nest CPM is 90×48µm² in 65nm SOI. There are 24 CPMs distributed across the microprocessor (Fig. 22.1.2): 8 in each core and 8 in the nest. Because of the time-to-digital nature of the output, its sensitivity to multiple variables, and its distribution across the microprocessor, the CPM measures local power-supply droop, clock instability, process variation, NBTI and early aging effects, and temperature in addition to timing, although it is not always possible to separate these effects.

Figure 22.1.3 shows the measured delay versus voltage on nominal parts for the core CPM paths. The paths are normalized to demonstrate the different slopes of each delay path. There is some divergence in the pass-gate and wire delay paths from the MOS paths at the edges of the operating range. The small variation in the wire-delay path demonstrates that for even large percentages of wire delay, the MOS delay variation dominates the path delay. Figure 22.1.4 shows the average maximum frequency of the microprocessor versus the measured bit position of the adder path for the CPMs in core 0 at each voltage. The curve is generated by running a heavy workload and increasing the frequency until failure. If the CPM exactly tracks the critical path, the bit position at failure should not change. There is an average of three bits of rise in the bit position as voltage rises, indicating the adder path does not exactly match the critical path. None of the paths exactly tracks the critical path, but because the output is a the...
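The time-to-digital conversion and the tracking check described above can be illustrated with a minimal Python sketch. It is not the CPM circuit: the function names, the 12-bit detector width, the uniform stage delay, the ones-then-zeros capture word, and every numeric value are assumptions chosen only for illustration.

```python
def edge_bit_position(clock_period_ps, path_delay_ps, stage_delay_ps, n_bits=12):
    """Count the detector stages a launched edge passes before the clock captures it.

    Assumed model: the edge first crosses the synthesized delay path, then
    advances one detector stage per stage_delay_ps until the period ends.
    """
    slack_ps = clock_period_ps - path_delay_ps
    if slack_ps <= 0:
        return 0  # the edge never reaches the detector within the cycle
    return min(n_bits, int(slack_ps // stage_delay_ps))


def capture_word(bit_position, n_bits=12):
    """Render the captured edge as a run of ones followed by zeros (assumed format)."""
    return "1" * bit_position + "0" * (n_bits - bit_position)


def bit_position_drift(bits_at_failure):
    """Spread of the failure-point bit position across supply voltages.

    A monitor that tracked the critical path exactly would give a drift of 0.
    """
    values = list(bits_at_failure.values())
    return max(values) - min(values)


# Hypothetical numbers, not measurements from the paper.
position = edge_bit_position(clock_period_ps=200.0, path_delay_ps=150.0, stage_delay_ps=5.0)
word = capture_word(position)
drift = bit_position_drift({0.9: 4, 1.0: 5, 1.1: 6, 1.2: 7})
```

In this sketch, a drift of several bits across the voltage range corresponds to the situation the text reports for the adder path: the monitored path and the true critical path scale differently with voltage.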
POWER6™ is a dual-core microprocessor fabricated in a 65nm SOI process with 10 levels of low-dielectric copper interconnects. The die, shown in Fig. 16.7.1, measures 341mm², contains over 700M transistors, delivers clock frequencies exceeding 5GHz in high-performance applications, and consumes less than 100W in power-sensitive applications [1]. Chips with split and connected core power supplies are fabricated, modeled, and tested, showing the advantages and disadvantages of each, with important implications for chips with large numbers of cores. One power grid design has the two processor cores on isolated logic power boundaries. The other has both cores tied into the rest of the chip (called the nest) on both the chip and package. The split cores allow independent voltage tuning to optimize power versus performance. The manufactured die has systematic and non-systematic variation across the chip that can make one core faster and leakier than the other, but both cores run at the same clock frequency on POWER6. With separate power domains, the voltage can be lowered on the core with faster circuits. This is done on previous generations of PowerPC microprocessors [2]. Another advantage of split cores is support for power-down modes for an unused core. The disadvantage of the split power grid is that the cores do not benefit from being connected to the relatively quiet nest. The cores consume considerably more power and have much higher dI/dt than the nest, which is made up mostly of level-2 cache and I/O. With cores and nest connected, the cores share the quiet on-chip nest capacitance and a lower-inductance path to the package decoupling capacitors, further reducing power noise in the cores. Figure 16.7.2 is a simple schematic illustrating the mid-frequency characteristics of the chip and package power distribution.
In the figure, R1, C1, L1, and R2 represent the package, which, for the POWER6 package, has a 125MHz resonant frequency. C2 is the intrinsic wire and device capacitance. R3 and C3 represent the added on-chip decoupling capacitance. R4 represents the (nonlinear) leakage and R5 is used to cause the current step in simulation. A single POWER6 core is capable of causing a 13W power step within about 20 clock cycles. Detailed chip-package simulation predicts this causes a 130mV power droop when the cores are split. When the cores are tied, this droop is cut in half.
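The figures quoted above can be turned into a short back-of-the-envelope calculation. The Python sketch below uses the standard LC resonance relation f0 = 1/(2π√(LC)) and the stated 125MHz, 13W, 20-cycle, and 130mV values. The supply voltage, the clock frequency used to convert cycles to time, and the capacitance value are assumptions (marked in the code), not values given in the text.

```python
import math


def lc_product(resonant_hz):
    """L*C product implied by a resonance at resonant_hz (f0 = 1 / (2*pi*sqrt(L*C)))."""
    return 1.0 / (2.0 * math.pi * resonant_hz) ** 2


def inductance_for(resonant_hz, capacitance_f):
    """Inductance that resonates with capacitance_f at resonant_hz."""
    return lc_product(resonant_hz) / capacitance_f


# Values stated in the text.
F_RES_HZ = 125e6          # package resonant frequency
POWER_STEP_W = 13.0       # single-core power step
STEP_CYCLES = 20          # approximate duration of the step in clock cycles
SPLIT_DROOP_MV = 130.0    # simulated droop with split cores

# Assumptions for illustration only; none of these is given in the text.
ASSUMED_VDD_V = 1.0
ASSUMED_CLOCK_HZ = 5e9
ASSUMED_C1_F = 100e-9

l1_h = inductance_for(F_RES_HZ, ASSUMED_C1_F)      # inductance under the assumed C1
delta_i_a = POWER_STEP_W / ASSUMED_VDD_V           # current step at the assumed supply
ramp_s = STEP_CYCLES / ASSUMED_CLOCK_HZ            # step duration at the assumed clock
di_dt_a_per_ns = delta_i_a / (ramp_s * 1e9)        # average slew rate of the step
tied_droop_mv = SPLIT_DROOP_MV / 2.0               # "cut in half" when cores are tied
```

Under these assumptions the step lasts a few nanoseconds, far shorter than the 8ns period of a 125MHz resonance, so the package cannot respond during the step and the on-chip capacitance must supply the charge. This is consistent with the text's point that sharing the nest capacitance reduces the droop.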
Timing uncertainty in microprocessors arises from several sources, including PLL jitter, clock distribution skew and jitter, across-chip device variations, and power supply noise. The on-chip measurement macro called SKITTER (SKew+jITTER) was designed to measure timing uncertainty from all combined sources by measuring the number of logic stages that complete in a cycle. This measure of completed delay stages has proven to be a very sensitive monitor of power supply noise, which has emerged as a dominant component of timing uncertainty. This paper describes Skitter measurement experiences on several IBM microprocessors, including PPC970MP, XBOX360™, CELL Broadband Engine™, and POWER6™, running different workloads.
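One way to read a series of completed-stage counts is sketched below in Python. This is an assumed post-processing step, not the SKITTER macro's actual readout: the function name, the use of the mean count to calibrate the stage delay, and the sample readings are all hypothetical.

```python
from statistics import mean


def timing_uncertainty(stage_counts, cycle_time_ps):
    """Summarize cycle-to-cycle variation from per-cycle completed-stage counts.

    Assumption: the mean count corresponds to the nominal cycle time, so one
    stage is worth cycle_time_ps / mean(stage_counts) picoseconds.
    """
    if not stage_counts:
        raise ValueError("stage_counts must not be empty")
    nominal = mean(stage_counts)
    stage_delay_ps = cycle_time_ps / nominal
    spread_stages = max(stage_counts) - min(stage_counts)
    return {
        "stage_delay_ps": stage_delay_ps,
        "spread_stages": spread_stages,
        "spread_ps": spread_stages * stage_delay_ps,
        "spread_fraction": spread_stages / nominal,
    }


# Hypothetical readings: fewer completed stages means slower logic in that
# cycle, for example during a supply droop.
readings = [40, 39, 41, 40, 38, 42]
summary = timing_uncertainty(readings, cycle_time_ps=200.0)
```

Because the count lumps together every effect that changes stage delay or cycle length, a summary like this cannot attribute the spread to one source. That matches the text's description of the measurement as covering all combined sources.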