Abstract-The huge popularity of recent peer-to-peer (P2P) file sharing systems has been mainly driven by the scalability of their architectures and the flexibility of their search facilities. Such systems are usually designed as unstructured P2P networks, because they impose few constraints on topology and data placement and support highly versatile search mechanisms. A major limitation of unstructured P2P networks lies, however, in the inefficiency of their search algorithms, which are usually based on simple flooding schemes. In this paper, we propose novel mechanisms for improving search efficiency in unstructured P2P networks. Unlike other approaches, we do not rely on specialized search algorithms; instead, the peers perform local dynamic topology adaptations, based on the query traffic patterns, in order to spontaneously create communities of peers that share similar interests. The basic premise of such semantic communities is that file requests have a high probability of being fulfilled within the community they originate from, therefore increasing the search efficiency. We propose further extensions to balance the load among the peers and reduce the query traffic. Extensive simulations under realistic operating conditions substantiate that our techniques significantly improve the search efficiency and reduce the network load.
I. INTRODUCTION
A. Motivations
The last few years have witnessed the appearance of a growing number of peer-to-peer (P2P) file sharing systems. Such systems make it possible to harness the resources of large populations of networked computers in a cost-effective manner, and are characterized by their high scalability.
P2P file sharing systems differ mainly in their search facilities. The first hugely successful P2P data exchange system, Napster [1], incorporates a centralized search facility that keeps track of files and peer nodes; queries are executed by the central server, while the resource-demanding file transfers are performed using P2P communication. This hybrid architecture offers powerful and responsive query processing, while still scaling well to large peer populations. The central server needs, however, to be properly dimensioned to support the user query load. In addition, it constitutes a single point of failure and can easily be brought down in the face of a legal challenge, as was the case for Napster. Consequently, most recent P2P file sharing systems have adopted more decentralized architectures.
Roughly speaking, the P2P networks that do not rely on a centralized directory can be classified as either structured or unstructured. Structured P2P networks (e.g., Chord [2], CAN [3],