The next-generation System z design introduces a new microprocessor chip (CP) and a system controller chip (SC) aimed at providing a substantial boost to maximum system capacity and performance compared to the previous zEC12 design in 32nm [1,2]. As shown in the die photo, the CP chip includes 8 high-frequency processor cores, 64MB of eDRAM L3 cache, interface IOs ("XBUS") to connect to two other processor chips and the L4 cache chip, along with memory interfaces, 2 PCIe Gen3 interfaces, and an I/O bus controller (GX). The design is implemented on a 678 mm² die with 4.0 billion transistors and 17 levels of metal interconnect in IBM's high-performance 22nm high-κ CMOS SOI technology [3]. The SC chip is also a 678 mm² die, with 7.1 billion transistors, running at half the clock frequency of the CP chip, in the same 22nm technology, but with 15 levels of metal. It provides 480 MB of eDRAM L4 cache, an increase of more than 2× from zEC12 [1,2], and contains an 18 MB eDRAM L4 directory, along with multi-processor cache control/coherency logic to manage inter-processor and system-level communications. Both the CP and SC chips incorporate significant logical, physical, and electrical design innovations.

Systems are built from configurable nodes of tightly-coupled CP and SC chips, each packaged on single-chip modules (Fig. 4.1.1). This structure provides improved flexibility and modularity compared to the multi-chip modules used previously. All high-speed node-to-node and drawer-to-drawer communication is through the SC chip using micro-controllers to manage the flow. Each SC chip contains over 440 of these micro-controllers along with a series of wide multiplexers to manage the traffic.
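As a rough illustration of the figures quoted above, the following sketch computes the average transistor density of each chip from the stated die area and transistor counts. The function name and constants are illustrative, not from the paper, and the result is only a whole-die average that ignores I/O, analog, and unused area.

```python
# Figures quoted in the abstract: both chips use a 678 mm^2 die.
DIE_AREA_MM2 = 678.0
CP_TRANSISTORS = 4.0e9   # processor chip
SC_TRANSISTORS = 7.1e9   # system controller chip


def density_millions_per_mm2(transistors: float, area_mm2: float) -> float:
    """Average transistor density in millions of transistors per mm^2."""
    return transistors / area_mm2 / 1e6


cp_density = density_millions_per_mm2(CP_TRANSISTORS, DIE_AREA_MM2)
sc_density = density_millions_per_mm2(SC_TRANSISTORS, DIE_AREA_MM2)
ratio = sc_density / cp_density

print(f"CP: {cp_density:.1f} M transistors/mm^2")
print(f"SC: {sc_density:.1f} M transistors/mm^2")
print(f"SC/CP density ratio: {ratio:.3f}")
```

On the same die area the SC chip averages a little under twice the density of the CP chip, which is plausibly a consequence of its area being dominated by the 480 MB eDRAM L4 cache, although the abstract does not state this.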
Both the CP and SC chips support high levels of I/O bandwidth, with about 5Tb/s total bandwidth for each CP or SC chip, running at speeds of up to 5Gb/s (single-ended) and 9.6Gb/s (differential).

The CP chip adopted a unique floorplan configuration, driven by the width of the cores, which were too wide to fit four across on the die. This floorplan created significant logical and physical complexities in the L3 design, but careful engineering prevented these issues from having any meaningful impact on latency or bandwidth of the L3. The entire L3 and all 8 cores are covered with a single large "mega-mesh" clock domain, maximizing on-chip bus bandwidth. The unified mega-mesh design enables double-pumping of many on-chip buses for wider effective bandwidth, and eliminates any mesh-to-mesh timing margins in critical core-to-L3 timing paths.

The CP processor core design, shown in Fig. 4.1.2, improves upon the zEC12 processor [4] with two vector execution units, significantly higher instruction-per-cycle throughput, and a new SMT2 micro-architecture supporting simultaneous execution of two threads. The microprocessor core features a wide superscalar, out-of-order pipeline that can sustain an instruction fetch, decode, dispatch and completion rate of six CISC instructions per cycle. The instruction execution path is predicted by multi-level bra...
In 2001, IBM delivered to the marketplace a high-performance UNIX®-class eServer based on a four-chip multichip module (MCM) code named Regatta. This MCM supports four POWER4 chips, each with 170 million transistors, which utilize the IBM advanced copper back-end interconnect technology. Each chip is attached to the MCM through 7018 flip-chip solder connections. The MCM, fabricated using the IBM high-performance glass-ceramic technology, features 1.7 million internal copper vias and high-density top-surface contact pad arrays with 100-µm pads on 200-µm centers. Interconnections between chips on the MCM and interconnections to the board for power distribution and MCM-to-MCM communication are provided by 190 meters of co-sintered copper wiring. Additionally, the 5100 off-module connections on the bottom side of the MCM are fabricated at a 1-mm pitch and connected to the board through the use of a novel land grid array technology, thus enabling a compact 85-mm × 85-mm module footprint that supports 8- to 32-way systems with processors operating at 1.1 GHz or 1.3 GHz. The MCM also incorporates advanced thermal solutions that enable 156 W of cooling per chip. This paper presents a detailed overview of the fabrication, assembly, testing, and reliability qualification of this advanced MCM technology.
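The per-chip figures in this abstract can be rolled up into per-module totals, as in the minimal sketch below. The variable names are illustrative. The grid-site count assumes a full square array of land grid array pads at 1-mm pitch across the whole 85-mm × 85-mm footprint, which is an assumption used only as an upper bound and is not a layout described in the abstract.

```python
# Per-chip and per-module figures quoted in the abstract.
CHIPS_PER_MCM = 4
TRANSISTORS_PER_CHIP = 170_000_000
FLIP_CHIP_JOINTS_PER_CHIP = 7018
COOLING_W_PER_CHIP = 156
OFF_MODULE_CONNECTIONS = 5100
FOOTPRINT_MM = 85
LGA_PITCH_MM = 1

# Simple per-module totals.
total_transistors = CHIPS_PER_MCM * TRANSISTORS_PER_CHIP
total_flip_chip_joints = CHIPS_PER_MCM * FLIP_CHIP_JOINTS_PER_CHIP
total_cooling_w = CHIPS_PER_MCM * COOLING_W_PER_CHIP

# Upper bound on pad sites, ASSUMING a full square grid over the footprint.
max_grid_sites = (FOOTPRINT_MM // LGA_PITCH_MM) ** 2
site_utilization = OFF_MODULE_CONNECTIONS / max_grid_sites

print(f"Transistors per MCM:      {total_transistors:,}")
print(f"Flip-chip joints per MCM: {total_flip_chip_joints:,}")
print(f"Cooling per MCM:          {total_cooling_w} W")
print(f"Max 1-mm grid sites:      {max_grid_sites:,}")
print(f"Share used by 5100 pads:  {site_utilization:.0%}")
```

Under that assumption, the 5100 off-module connections occupy roughly 70% of the sites a full 1-mm grid would offer, which is consistent with the stated footprint.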