Search engines are an important tool for users to access network information resources. However, the large number of duplicate and near-duplicate pages adds to users' burden. Currently, search engines remove only duplicate pages and do not yet have an effective strategy for detecting and handling near-duplicate pages. This paper analyzes existing algorithms to select an appropriate one for detecting near-duplicate pages, and optimizes the handling strategy so that near-duplicate pages do not take up too much space in search results while still being used effectively. Together, these measures allow users to retrieve the information they need more easily.