Summary
Cloud storage is a widely utilized service for both personal and enterprise demands. However, despite its advantages, many potential users with enormous amounts of sensitive data (big data) refrain from fully utilizing the cloud storage service due to valid concerns about data privacy. An established solution to the cloud data privacy problem is to perform encryption on the client‐end. This approach, however, restricts data processing capabilities (eg, searching over the data). Accordingly, the research problem we investigate is how to enable real‐time searching over the encrypted big data in the cloud. In particular, semantic search is of interest to clients dealing with big data. To address this problem, in this research, we develop a system (termed S3BD) for searching big data using cloud services without exposing any data to cloud providers. To keep real‐time response on big data, S3BD proactively prunes the search space to a subset of the whole dataset. For that purpose, we propose a method to cluster the encrypted data. An abstract of each cluster is maintained on the client‐end to navigate the search operation to appropriate clusters at the search time. Results of experiments, carried out on real‐world big datasets, demonstrate that the search operation can be achieved in real‐time and is significantly more efficient than other counterparts. In addition, a fully functional prototype of S3BD is made publicly available.