With the rapid development of the Internet and websites, the amount of web data is increasing exponentially, which poses great challenges for the traditional centralized search engine in real-time search, response speed and the storage of massive numbers of pages. Cloud computing, which can integrate the resources of many PCs to provide distributed storage and parallel computing, has many advantages in dealing with massive data. Therefore, a distributed search engine on a cloud platform is proposed in this paper. It consists of three main modules: crawling, indexing and retrieving. It also provides visual user interaction interfaces. The search engine is implemented on Hadoop, with the Hadoop Distributed File System (HDFS) for storage and the MapReduce parallel programming model for the indexing and retrieving functions. The search engine on Hadoop can store massive numbers of pages on many cheap machines, retrieve the wanted information from massive data, as shown in the function test, and reduce query time, as shown in the performance test, so it is economical, accurate and efficient.

A search engine is a system that, based on certain strategies, uses specific computer programs to collect information from the Internet and organizes that information to provide retrieval services for users. With the rapid development of the Internet, the search engine has become more and more important for Internet users to retrieve useful information. To improve the speed and accuracy of the search engine, researchers have put forward many optimization techniques, such as intelligent agent technology, natural language processing technology [1], semantic search [2] and so on. However, the rapid increase in websites and the online population has led to an explosion of web data. For example, the web pages stored in the Google database have reached 30 billion. The traditional single, centralized search engine may have trouble dealing with such massive data because it cannot achieve efficient parallel operation.
978-1-4799-1392-3/13/$31.00 ©2013 IEEE

Cloud computing technology has developed rapidly in recent years. Many companies have launched their own commercial clouds, such as Google App Engine, Amazon AWS, Microsoft Windows Azure, etc. To facilitate academic research, Apache provides an open source cloud computing platform, Hadoop [3], which consists of two basic parts, HDFS [4] and MapReduce [5], and some other projects based on them. Hadoop has the ability to store and efficiently process massive data, and its applications are extensive. Moreover, Hadoop can use many cheap machines to build a more reliable and secure data center. The results of [6] show that Hadoop has many advantages in storing and processing massive data. As a result, many applications [7]-[8] are deployed on Hadoop. Some researchers have also proposed combining the search engine with a cloud computing platform [9]-[13], which can solve the problem of resources and efficiency and sharply reduce the user retrieval time. The feasibility and advantages of the combination of cloud co...