Summary
Organizations prefer to store their data in the cloud because of the availability of cost‐effective storage. The outsourced data often include sensitive information, so they are encrypted, since maintaining the confidentiality and privacy of the documents is of paramount importance. Retrieving the desired information from the cloud requires efficient searching, which involves the end‐user submitting a search query to the cloud server. Because the search terms may themselves include sensitive information about an organization, the search query should not reveal any confidential information. Existing works are not suitable for big‐data scenarios because of the high search time required for large document collections, which leads to increased cloud usage cost. This paper therefore proposes an efficient approach to searching encrypted data using clustering. Because the proposed technique clusters the documents based on the relationship between their keywords, a search examines only the documents within the relevant cluster rather than the entire dataset. An efficient ranking method ranks the documents by their relevance to the search query using the Term Frequency‐Inverse Document Frequency (tf‐idf) values of the keywords in the documents, which reduces communication overhead because fewer unnecessary documents are downloaded. Moreover, an efficient query randomization approach is proposed so that two or more queries involving the same search terms appear distinct. Experimental results on real datasets demonstrate that the proposed multi‐keyword ranked search scheme on encrypted cloud data significantly reduces the number of comparisons and the search time compared with existing techniques while maintaining a recall of 100% and a precision of 82%.
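
To make the cluster-restricted ranking idea concrete, the following is a minimal plaintext sketch, not the scheme proposed in the paper. It omits encryption and query randomization entirely, and it assumes a common tf-idf variant (relative term frequency times the natural log of inverse document frequency). The names `tf_idf_scores`, `ranked_search`, the toy documents, and the hand-built clusters are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(docs):
    """Return {doc_id: {term: tf-idf weight}} for non-empty tokenized documents.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        scores[doc_id] = {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in counts.items()
        }
    return scores


def ranked_search(query_terms, clusters, scores, top_k=2):
    """Select the cluster sharing the most keywords with the query, then
    rank only that cluster's documents by summed tf-idf of the query terms."""
    query = set(query_terms)
    best = max(clusters, key=lambda c: len(clusters[c]["keywords"] & query))
    ranked = sorted(
        clusters[best]["docs"],
        key=lambda d: sum(scores[d].get(t, 0.0) for t in query),
        reverse=True,
    )
    return best, ranked[:top_k]


# Hypothetical toy collection; the clusters are assigned by hand here,
# whereas the paper derives them from keyword relationships.
docs = {
    "d1": ["cloud", "storage", "cost", "cloud"],
    "d2": ["cloud", "encryption", "key"],
    "d3": ["patient", "record", "diagnosis"],
    "d4": ["patient", "insurance", "record", "record"],
}
clusters = {
    "it": {
        "keywords": {"cloud", "storage", "cost", "encryption", "key"},
        "docs": ["d1", "d2"],
    },
    "health": {
        "keywords": {"patient", "record", "diagnosis", "insurance"},
        "docs": ["d3", "d4"],
    },
}

scores = tf_idf_scores(docs)
cluster_id, results = ranked_search(["patient", "record"], clusters, scores)
```

In this toy run only the two documents in the `health` cluster are scored against the query, which is the source of the reduced comparison count the summary describes; returning only the `top_k` documents mirrors the reduction in downloaded documents.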