All Internet routers contain buffers to hold packets during times of congestion. Today, the size of the buffers is determined by the dynamics of TCP's congestion control algorithm. In particular, the goal is to make sure that when a link is congested, it is busy 100% of the time, which is equivalent to making sure its buffer never goes empty. A widely used rule-of-thumb states that each link needs a buffer of size B = RTT × C, where RTT is the average round-trip time of a flow passing across the link, and C is the data rate of the link. For example, a 10Gb/s router linecard needs approximately 250ms × 10Gb/s = 2.5Gbits of buffers, and the amount of buffering grows linearly with the line rate. Such large buffers are challenging for router manufacturers, who must use large, slow, off-chip DRAMs. And queueing delays can be long, have high variance, and may destabilize the congestion control algorithms. In this paper we argue that the rule-of-thumb (B = RTT × C) is now outdated and incorrect for backbone routers. This is because of the large number of flows (TCP connections) multiplexed together on a single backbone link. Using theory, simulation and experiments on a network of real routers, we show that a link with n flows requires no more than B = (RTT × C)/√n, for long-lived or short-lived TCP flows. The consequences for router design are enormous: a 2.5Gb/s link carrying 10,000 flows could reduce its buffers by 99% with negligible difference in throughput; and a 10Gb/s link carrying 50,000 flows requires only 10Mbits of buffering, which can easily be implemented using fast, on-chip SRAM.
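To make the arithmetic concrete, here is a minimal sketch (plain Python; the figures are the ones quoted in the abstract, and the function names are ours) comparing the rule-of-thumb buffer B = RTT × C with the proposed B = (RTT × C)/√n for the example links mentioned above.

```python
import math

def rule_of_thumb_bits(rtt_s: float, rate_bps: float) -> float:
    """Classic buffer rule: B = RTT * C."""
    return rtt_s * rate_bps

def small_buffer_bits(rtt_s: float, rate_bps: float, n_flows: int) -> float:
    """Proposed rule for many multiplexed flows: B = (RTT * C) / sqrt(n)."""
    return rtt_s * rate_bps / math.sqrt(n_flows)

rtt = 0.250  # 250 ms average round-trip time, as in the abstract

# 10 Gb/s linecard under the rule-of-thumb
print(rule_of_thumb_bits(rtt, 10e9) / 1e9, "Gbits")         # 2.5 Gbits

# 2.5 Gb/s link carrying 10,000 flows: ~99% smaller than RTT*C
b_old = rule_of_thumb_bits(rtt, 2.5e9)
b_new = small_buffer_bits(rtt, 2.5e9, 10_000)
print(f"reduction: {100 * (1 - b_new / b_old):.1f}%")        # 99.0%

# 10 Gb/s link carrying 50,000 flows: roughly 10 Mbits of buffering
print(small_buffer_bits(rtt, 10e9, 50_000) / 1e6, "Mbits")   # ~11 Mbits
```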
High performance packet switches frequently use a centralized scheduler (also known as an arbiter) to determine the configuration of a non-blocking crossbar. The scheduler often limits the scalability of the system because of the frequency and complexity of its decisions. A recent paper by C.-S. Chang et al. introduces an interesting two-stage switch, in which each stage uses a trivial deterministic sequence of configurations. The switch is simple to implement at high speed and has been proved to provide 100% throughput for a broad class of traffic. Furthermore, the average delay of the two-stage switch is provably within a bound of that of an ideal output-queued switch. However, in its simplest form, the switch mis-sequences packets by an arbitrary amount. In this paper, building on the two-stage switch, we present an algorithm called Full Frames First (FFF) that prevents mis-sequencing while maintaining the performance benefits (in terms of throughput and delay) of the basic two-stage switch.
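As a rough illustration of what a "trivial deterministic sequence of configurations" can look like, the sketch below (our own assumption of a cyclic-shift pattern on an illustrative N-port stage; it is not the FFF algorithm and not code from the paper) shows how each stage can step through fixed crossbar settings with no scheduler decisions at all.

```python
# Minimal sketch (our illustration): each stage cycles through N fixed
# permutations, one per time slot. The assumed pattern is a cyclic shift:
# at slot t, port i of a stage connects to port (i + t) % N.

N = 4  # illustrative port count

def stage_configuration(t: int, n: int = N) -> list[tuple[int, int]]:
    """Deterministic crossbar setting for time slot t: (input, output) pairs."""
    return [(i, (i + t) % n) for i in range(n)]

for t in range(N):
    print(f"slot {t}: {stage_configuration(t)}")

# Over any N consecutive slots every input is connected to every output
# exactly once, so traffic is spread (stage 1) and delivered (stage 2)
# without a centralized arbiter; FFF adds ordering on top of this idea.
```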