Cloud computing technology has attracted the attention of researchers and organizations because of its computing power, efficiency, and flexibility. Using cloud computing to analyze outsourced data has become a new data utilization model. However, because of the severe security risks in cloud computing, most organizations now encrypt their data before outsourcing it. As a result, many new works on the k-Nearest Neighbor (k-NN) algorithm for encrypted data have appeared in recent years. Current research has two main problems: existing schemes are either not secure enough or inefficient. To address these problems, this paper designs a non-interactive privacy-preserving k-NN query and classification scheme. The proposed scheme uses two existing encryption schemes, Order Preserving Encryption and the Paillier cryptosystem, to preserve the confidentiality of the encrypted outsourced data, the data access patterns, and the query record, and it uses an encrypted k-dimensional tree (kd-tree) to optimize the traditional k-NN algorithm. The scheme aims to achieve high query efficiency while ensuring data security. Extensive experimental results show that the classification accuracy of this scheme is very close to that of the scheme using plaintext data and that of the existing non-interactive encrypted data query scheme. The query runtime of the scheme is shorter than that of the existing non-interactive k-NN query scheme.
INDEX TERMS Privacy preservation, k-nearest neighbors, k-dimensional tree, outsourced data.
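As a rough illustration of how a kd-tree speeds up k-NN queries and classification, the following is a minimal plaintext sketch. It is not the encrypted scheme described in the abstract: there is no Order Preserving Encryption or Paillier layer, and the function names (`build_kdtree`, `knn_query`, `knn_classify`) are hypothetical helpers chosen for this example.

```python
import heapq
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, label) pairs.

    Each level splits on one coordinate axis, cycling through the axes,
    and stores the median point at the node.
    """
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def knn_query(node, query, k, heap=None):
    """Return the k nearest points as a list of (-sq_distance, vector, label).

    The list is a max-heap on distance (stored as negatives), so heap[0]
    is always the current worst of the k best candidates.
    """
    if heap is None:
        heap = []
    if node is None:
        return heap
    vec, label = node["point"]
    d = sq_dist(vec, query)
    if len(heap) < k:
        heapq.heappush(heap, (-d, vec, label))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, vec, label))
    # Descend first into the side of the split that contains the query.
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    knn_query(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th best distance; otherwise the whole subtree is pruned.
    if len(heap) < k or diff ** 2 < -heap[0][0]:
        knn_query(far, query, k, heap)
    return heap


def knn_classify(tree, query, k):
    """Classify the query by majority vote among its k nearest neighbors."""
    votes = Counter(label for _, _, label in knn_query(tree, query, k))
    return votes.most_common(1)[0][0]
```

The pruning test in `knn_query` is what saves work compared with a linear scan: a subtree is skipped whenever it cannot contain a point closer than the current k-th best candidate.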
This paper proposes a heuristic algorithm for fast mining of association rules using multidimensional scaling (MDS). It takes similarity measurements as the MDS proximities and develops a practical MDS model that generates a decentralized configuration of points representing the stops on vehicle routes. The algorithm extends the SMACOF algorithm with grouping and join steps. Experiments show that the new algorithm is much more efficient than the Apriori algorithm, especially when mining association rules with long patterns in transportation systems.
The density-based spatial clustering algorithm DBSCAN has relatively low efficiency because it performs a large number of unnecessary distance computations; grid-based spatial clustering algorithms are more efficient, but their clustering results have low accuracy. Considering the advantages and disadvantages of the two approaches, this paper proposes GNDBSCAN, a fast clustering algorithm based on both grids and density. The algorithm performs density-based clustering on a data space that has been partitioned into grid cells. It improves clustering efficiency while maintaining high accuracy in the clustering results.
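The general idea of combining a grid with density-based clustering can be sketched as follows. This is a generic grid-indexed DBSCAN for 2-D points, not the GNDBSCAN algorithm itself, whose details the abstract does not give. The function name `grid_dbscan` and the choice of a cell side equal to `eps` are assumptions made for this example.

```python
import math
from collections import defaultdict, deque
from itertools import product


def grid_dbscan(points, eps, min_pts):
    """Cluster 2-D points with DBSCAN, using a uniform grid for neighbor search.

    Returns one label per point: 0, 1, 2, ... for clusters and -1 for noise.
    """
    # Bucket point indices by grid cell. With a cell side of eps, every
    # point within eps of p lies in p's cell or one of the 8 adjacent cells.
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(i)

    def neighbors(i):
        x, y = points[i]
        cx, cy = math.floor(x / eps), math.floor(y / eps)
        found = []
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in grid.get((cx + dx, cy + dy), ()):
                px, py = points[j]
                if (px - x) ** 2 + (py - y) ** 2 <= eps * eps:
                    found.append(j)
        return found  # includes i itself, as in standard DBSCAN

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels
```

Restricting each neighbor search to a 3x3 block of cells avoids the all-pairs distance computations that make plain DBSCAN slow, while the clustering result is the same as that of an unindexed DBSCAN with the same `eps` and `min_pts`.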