Abstract — We describe an approach to the analysis of the performance of distributed applications in high-speed wide area networks. The approach is designed to identify all of the issues that impact performance, and to isolate the causes due to the related hardware and software components. We also describe the use of a distributed parallel data server as a network load generator that can be used in conjunction with this approach to probe various aspects of high-speed distributed systems, from top to bottom of the protocol stack and from end to end in the network. To demonstrate the utility of this approach, we present the analysis of a TCP-over-ATM problem that was uncovered while developing this methodology. This work was done in conjunction with the ARPA-funded MAGIC gigabit testbed.

The Multidimensional Applications Gigabit Internetwork Consortium (MAGIC) testbed is a large-scale, high-speed asynchronous transfer mode (ATM) network: a heterogeneous collection of ATM switches and computing platforms, several different implementations of Internet Protocol (IP) over ATM, a collection of "middleware" (distributed services), and so on, all of which must cooperate in order to make a complex application operate at high speed. As developers of high-speed network-based distributed services, we often observe unexpectedly low network throughput and/or high latency. The reason for the poor performance is frequently not obvious. The bottlenecks can be (and have been) in any of the components: the applications, the operating systems, the device drivers, the network adapters on either the sending or receiving host (or both), the network switches and routers, and so on.
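A load generator of the kind described above must exercise several concurrent network paths at once, rather than a single stream. The following is a minimal sketch of that idea, not the MAGIC parallel data server itself: several concurrent TCP senders, each timing its own transfer against a local sink, so that per-connection throughput can be observed under simultaneous load. The connection count, transfer size, and use of localhost are illustrative assumptions.

```python
# Sketch of a multi-connection load generator: N concurrent TCP senders,
# each timing its own transfer.  All names and parameters here are
# illustrative; this is not the MAGIC distributed parallel data server.
import socket
import threading
import time


def sink(server_sock, conns):
    """Accept `conns` connections and drain whatever each sender transmits."""
    def drain(c):
        while c.recv(65536):   # recv() returns b"" at EOF
            pass
        c.close()
    for _ in range(conns):
        c, _ = server_sock.accept()
        threading.Thread(target=drain, args=(c,)).start()


def send_load(addr, nbytes, results, idx):
    """One sender: push nbytes to addr and record throughput in MB/s."""
    s = socket.create_connection(addr)
    chunk = b"x" * 65536
    t0 = time.perf_counter()
    sent = 0
    while sent < nbytes:
        s.sendall(chunk)
        sent += len(chunk)
    s.close()
    results[idx] = sent / (time.perf_counter() - t0) / 1e6


def run_probe(conns=4, nbytes=1 << 22):
    """Run `conns` concurrent senders; return per-connection MB/s figures."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))   # loopback sink stands in for a remote host
    srv.listen(conns)
    addr = srv.getsockname()
    threading.Thread(target=sink, args=(srv, conns), daemon=True).start()
    results = [0.0] * conns
    senders = [threading.Thread(target=send_load,
                                args=(addr, nbytes, results, i))
               for i in range(conns)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()
    return results
```

Unlike a single ttcp stream, the concurrent senders contend for the same host resources, which is closer to the bursty, multi-connection behavior of a real distributed application.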
It is difficult to track down performance problems because of the complex interaction between the many distributed system components, and because problems in one place may be most apparent somewhere else. Network performance tools such as ttcp and netperf are commonly used to determine the throughput between hosts on the network. While these are useful tools to start with, we have observed many cases where ttcp performance is reasonable, but real application performance is still poor. Real distributed applications are complex, bursty, and have more than one connection in and/or out of a given host at one time; tools like ttcp do not adequately simulate these conditions.

We have developed a methodology and tools for monitoring, under realistic operating conditions, the behavior of all the elements of the application-to-application communications path in order to determine exactly what is happening within this complex system. We have instrumented our applications to do timestamping and logging at every critical point. We have also modified some of the standard UNIX network and operating system (OS) monitoring tools to log "interesting" events using a common log format. This monitoring functionality is designed to facilitate performance tuning and network performance research. This a...

The work described in this article is supported by the Advanced Research Projects Agency (ARPA).
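The instrumentation described above can be sketched as follows: each critical point in the application emits a timestamped event record in one common format, so that logs produced by different hosts, applications, and monitoring tools can later be merged and correlated on a single timeline. The field names and `key=value` layout below are illustrative assumptions, not the authors' actual log format.

```python
# Minimal sketch of common-format event logging at critical points.
# The record layout (ts/host/prog/event key=value pairs) is illustrative.
import time


def log_event(log, host, program, event):
    """Append one timestamped record in a common key=value format."""
    log.append("ts=%.6f host=%s prog=%s event=%s"
               % (time.time(), host, program, event))


log = []
log_event(log, "sender", "app", "start_write")
# ... application writes data to the network here ...
log_event(log, "sender", "app", "end_write")
```

Because every record carries an absolute timestamp in the same format, logs gathered on the sending host, the receiving host, and intermediate monitoring tools can simply be concatenated and sorted by `ts` to reconstruct the end-to-end sequence of events.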