Summary
With the rapid spread of services in the Cloud computing environment, users find it increasingly difficult to locate the right service. A search engine equipped with a semantic focused crawler has therefore become a fundamental requirement. However, the huge number and varied functionality of Cloud services on the Web, together with the lack of standardized and coherent service descriptions, make it hard for crawlers to discover Cloud services effectively. To address these issues, we propose a Cloud service discovery crawler that employs a two‐level semantic similarity measure based on both TF‐IDF and LDA models. Moreover, to discover and categorize Cloud services automatically, we present a Cloud Service Ontology (CSOnt) that contains a set of concepts defining Cloud service categories. Experimental results show that the proposed method improves the performance of focused crawlers and offers an efficient way to crawl the Web and collect pages relevant to Cloud services.