A fully integrated 40-Gb/s transceiver is implemented in a 0.13-µm CMOS technology. This paper describes the challenges in designing a 20-GHz input sampler, a 20-GHz quadrature LC-VCO, a 20-GHz bang-bang phase detector, and a 40-Gb/s equalizer. The transceiver occupies 1.7 × 2.9 mm² and dissipates 3.6 W from a 1.45-V supply. With the equalizer on, the transmit jitter of the 39-Gb/s 2¹⁵ − 1 PRBS data is 1.85 ps rms, measured over a wire-bonded plastic ball grid array (PBGA) package, an 8-mm RO-4350B PCB trace, an on-board 2.4-mm connector, and a 1-m-long 2.4-mm coaxial cable, while the recovered clock jitter is 1.77 ps rms. The measured BER is < 10⁻¹⁴.

Introduction

This paper presents a fully integrated 40-Gb/s transceiver implemented in a 0.13-µm CMOS technology, including a 32:1 serializer, a 1:32 deserializer, a 20-GHz clock generator PLL, and a binary clock and data recovery PLL. The transceiver builds on the circuit techniques previously described in [1, 2], which enable a 40-Gb/s transmitter and a 20-GHz PLL in a CMOS process with an f_T of less than 70 GHz, at most half that of the competing SiGe, GaAs, or InP processes [3]. The techniques include single-transformer-based shunt-and-double-series inductive peaking, negative feedback for bandwidth extension, and the use of pulsed latches for fast timing closure. We extend these techniques to build a complete transceiver, and this paper addresses the new challenges that arise: a half-rate input sampler for the 40-Gb/s data stream, a 20-GHz quadrature LC-VCO, a low-latency phase detector for the bang-bang controlled clock recovery PLL, and a simple 40-Gb/s equalizer for channel loss compensation.

Receiver Architecture

A block diagram of the implemented 40-Gb/s deserializing receiver is shown in Fig. 1. The receiver is composed of a linear equalizer, four half-rate samplers, a 2:32 multi-stage deserializer, a PRBS verifier, and a bang-bang clock recovery PLL.
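As an illustration of what a PRBS verifier does with the 2¹⁵ − 1 pattern, the following is a minimal behavioral sketch in Python. It is not the on-chip implementation. The paper does not state the generator polynomial, so the commonly used x¹⁵ + x¹⁴ + 1 polynomial is assumed here, and the function names `prbs15` and `count_prbs15_errors` are hypothetical.

```python
def prbs15(n_bits, seed=0x7FFF):
    """Return n_bits of a 2^15 - 1 PRBS.

    Assumption: generator polynomial x^15 + x^14 + 1 (not stated in the paper).
    """
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be non-zero, or the LFSR stays at zero")
    out = []
    for _ in range(n_bits):
        # Feedback is the XOR of shift-register stages 15 and 14.
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7FFF
    return out


def count_prbs15_errors(rx_bits):
    """Self-synchronizing check: predict each bit from the received history.

    Every bit of the assumed sequence equals the XOR of the bits received
    15 and 14 positions earlier, so no seed alignment is needed.
    """
    errors = 0
    for n in range(15, len(rx_bits)):
        predicted = rx_bits[n - 15] ^ rx_bits[n - 14]
        if rx_bits[n] != predicted:
            errors += 1
    return errors
```

Because the checker predicts each bit from previously received bits, one flipped bit on the link shows up as three mismatches: once at the flipped bit itself and twice more when that bit is used as history.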
The four 20-GHz samplers latch the input voltage at the center and edge of the incoming 40-Gb/s NRZ data stream. The timing of the samplers is controlled by 20-GHz quadrature clocks. The phase detection logic following the samplers extracts the polarity of the phase error to adjust the VCO clock timing. The 2:32 deserializer consists of four cascaded 1:2 demultiplexing stages that convert the 2-bit-wide, 20-Gb/s retimed data to 32-bit-wide, 1.25-Gb/s data. The clock frequencies required by the different stages are generated by a chain of divide-by-2 dividers. The frequency detector compares the last divided clock with a 625-MHz reference clock to assist the main phase-detection loop in acquiring the initial frequency lock.

Building Blocks

The proposed 20-GHz sampler shown in Fig. 2 uses a pulsed-latch-based design to shorten the latch latency. The inductive peaking and conditional negative feedback