Traditional traffic identification methods based on well-known port numbers are not appropriate for identifying new types of Internet applications. This paper proposes a new method to identify current Internet traffic, which is a preliminary but essential step toward traffic characterization. We categorized most current network-based applications into several classes according to their traffic patterns. Then, using this categorization, we developed a flow grouping method that determines the application name of each traffic flow. We have incorporated our method into NG-MON, a traffic analysis system, to analyze Internet traffic between our enterprise network and the Internet, and characterized all of the traffic according to its application type.

Keywords: Passive traffic monitoring and analysis, application-level traffic identification, application-level traffic characterization, Internet traffic, streaming traffic, peer-to-peer traffic.

Manuscript received Apr. 8, 2004; revised Aug. 10, 2004. This work was supported in part by the Electrical and Computer Engineering Division at POSTECH under the BK21 program of the Ministry of Education, and the Program for the Training of Graduate Students in Regional Innovation of the Ministry of Commerce, Industry and Energy of the Korean Government.

Myung-Sup Kim (phone: +82 54 279 5654, email: mount@postech.ac.kr), Young J. Won (email: yjwon@postech.ac.kr), and James Won-Ki Hong (email: jwkhong@postech.ac.kr) are with the DPNM Laboratory, POSTECH, Pohang, Korea.
I. Introduction

In addition to its fundamental and traditional purposes, such as network planning, network problem detection, and network usage reporting, traffic monitoring and analysis is required in many areas that improve network service quality, such as abnormal traffic detection and usage-based accounting. However, to keep up with the evolution of the Internet in terms of underlying technologies and user services, network traffic monitoring and analysis techniques should be improved in both system architecture and analysis methodology. Two critical problems exist in today's Internet traffic monitoring and analysis. The first problem is how to handle, in real time, the massive and growing amount of traffic data generated by high-speed network links of 2.5 Gbps and higher [1]-[5]. The other problem is how to analyze the sophisticated traffic data generated by various newly emerging network-based applications, such as streaming media, peer-to-peer (P2P), and game applications [6]-[8].

Regarding the second problem, the types and patterns of current network traffic are more complex than they were in the past. In the past network environment, most Internet traffic consisted of HTTP, FTP, TELNET, SMTP, and NNTP. Today, the proportion of these well-known port-based traffic types is decreasing, while P2P, streaming media, and game traffic are increasing. Internet2 administrators report that about 4% of the traffic carried by their network is P2P traffic, while a further 54% of unident...