A pipeline structure is considered to be the best solution for overcoming LSI design constraints because it allows a divide-and-conquer design principle. It also helps reduce wiring lengths and thus minimizes extrinsic degradation. However, as clock rates increase, the structure suffers from excessive power consumption and from the skew problems associated with synchronous clock distribution. A self-timed pipeline (STP) structure has been proposed to solve these problems simultaneously [1]. A folded pipeline scheme employed in a folded queue (FQ) demonstrates another functional advantage of the STP scheme. An FQ capable of differentiating 100M packets/s streams on a Diffserv basis [2] has been successfully fabricated in a 0.18µm CMOS process.
Figure 8.1.1 shows the basic structure of an STP scheme, in which data transfer between the stages of a pipeline is controlled by a chain of self-timed transfer control units. Each control unit generates a clock signal for the data latches when the send and ack signals are both active: the send signal indicates that the processed data in the preceding stage is ready to be fed to the inputs of the latches, and the ack signal indicates that the succeeding stage is empty. A piece of input data traverses the pipeline stages until it reaches an occupied stage, i.e., one in which active data are present. When a piece of data is removed from the end of the pipeline, the remaining data in the pipeline stages step successively to the succeeding stages in bucket-relay fashion. The asynchronous FIFO is therefore elastic: the effective length of the pipeline adjusts to the amount of data residing in it. By virtue of localized control and wiring, the STP scheme provides good design and signal integrity even in deep-submicron chips.
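The send/ack transfer rule and the resulting elastic FIFO behavior can be illustrated with a small software model. This is only a behavioral sketch, not the circuit in Fig. 8.1.1: the class name `SelfTimedPipeline`, its methods, and the discrete `step` abstraction are assumptions introduced for illustration, and all timing, latch, and handshake-signal details are ignored.

```python
from typing import List, Optional


class SelfTimedPipeline:
    """Toy behavioral model of an elastic self-timed pipeline (hypothetical names)."""

    def __init__(self, depth: int) -> None:
        # One data latch per stage; None marks an empty stage.
        self.stages: List[Optional[int]] = [None] * depth

    def push(self, item: int) -> bool:
        """Offer data at the input; it is accepted only if the first stage is empty."""
        if self.stages[0] is None:
            self.stages[0] = item
            return True
        return False

    def pop(self) -> Optional[int]:
        """Remove and return the data held in the last stage (None if it is empty)."""
        item = self.stages[-1]
        self.stages[-1] = None
        return item

    def step(self) -> bool:
        """Fire every stage boundary whose send and ack are both active.

        Returns True if any data moved during this step.
        """
        moved = False
        # Scan from the output end so each item advances at most one stage per step.
        for i in range(len(self.stages) - 1, 0, -1):
            send = self.stages[i - 1] is not None  # preceding stage holds data
            ack = self.stages[i] is None           # succeeding stage is empty
            if send and ack:
                self.stages[i] = self.stages[i - 1]
                self.stages[i - 1] = None
                moved = True
        return moved

    def settle(self) -> None:
        """Step until no further transfer is possible."""
        while self.step():
            pass


if __name__ == "__main__":
    stp = SelfTimedPipeline(depth=4)
    for packet in (1, 2, 3):
        stp.push(packet)
        stp.settle()
    print(stp.stages)  # [None, 3, 2, 1]
    print(stp.pop())   # 1
    stp.settle()
    print(stp.stages)  # [None, None, 3, 2]
```

In this model the data pile up against the output end, and removing the item in the last stage lets the remaining items each advance by one stage, which mirrors the bucket-relay behavior described above.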
These features are utilized in the development of self-timed super-pipelined data-driven chip-multiprocessors (DDMPs) [1]. The mutual interactions among two or more STPs potentially provide various functionalities for developing SoCs. The FQ module shown in Fig. 8.1.2 is proposed as one such extension of the STP, to achieve the queueing and scheduling speed required for network processors working at over 10Gb/s. The figure shows an FQ constructed by folding a linear STP in half and attaching a shortcut path at each stage of the pipeline. Thanks to the elastic mode of operation, the bypass stage allows a piece of data flowing in the up-stream pipeline to bypass the pipeline when the corresponding stage of the opposite down-stream pipeline is not occupied. In other words, the data are automatically queued in the FQ if its egress is congested under certain external conditions. The FQ therefore adapts to the egress traffic condition as if it were a variable-length FIFO queue. Data branching and merging transfers in each stage of the FQ are locally controlled by the transfer control (TC) circuits, as shown in Fig. 8.1.3. The data branching off to the shortcut path is controlled by a...