The aim of dimensionality reduction is to construct a low-dimensional representation of high-dimensional input data in such a way that the important structure of the input data is preserved. This paper proposes applying dimensionality reduction to intrusion detection data using parallel Lanczos-SVD (PLSVD) with cloud technologies. The massive input data is stored on a distributed file system such as HDFS, and the Map/Reduce method is used for parallel analysis across many cluster nodes. Our experimental results show that, compared with the PCA algorithm, the PLSVD algorithm has better scalability and flexibility.
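As a rough illustration of how an SVD-style reduction can be split into map and reduce steps over row blocks, the following minimal sketch may help. It is not the PLSVD algorithm from the paper: for brevity it substitutes plain power iteration for the Lanczos process, recovers only the leading right singular vector, and runs the "map" step in a local loop rather than on a cluster. All function names (`map_partial`, `reduce_sum`, `top_right_singular_vector`, `project`) are hypothetical.

```python
import math


def map_partial(rows, v):
    """Map step for one row block A_i: return the vector A_i^T (A_i v)."""
    out = [0.0] * len(v)
    for row in rows:
        dot = sum(a * b for a, b in zip(row, v))
        for j, a in enumerate(row):
            out[j] += a * dot
    return out


def reduce_sum(partials):
    """Reduce step: add the per-block vectors element-wise."""
    return [sum(col) for col in zip(*partials)]


def top_right_singular_vector(blocks, dim, iters=200):
    """Power iteration on A^T A, where A is the row-wise stack of blocks.

    Assumption: power iteration stands in for Lanczos here; a real
    Lanczos-SVD would build a Krylov basis and yield several vectors.
    """
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Each block could be processed on a different node.
        w = reduce_sum([map_partial(block, v) for block in blocks])
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows, v):
    """Reduce each row to one coordinate along the direction v."""
    return [sum(a * b for a, b in zip(row, v)) for row in rows]
```

Only the current vector `v` and the per-block partial results move between steps, so the row blocks themselves can stay where they are stored.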
Vertical partitioning is the process of generating fragments, each of which is composed of attributes with high affinity. It is widely used in distributed databases to improve system efficiency by reducing the connections between table access operations. Current research on vertical partitioning focuses mainly on how to measure "affinity" in order to obtain the best-fit vertical partitioning, and on n-way vertical partitioning, which generates the specific number of fragments required by the user. In this paper, we propose a vertical partitioning algorithm based on privacy constraints. It supports both best-fit and n-way vertical partitioning, and it provides data privacy protection through privacy constraint checking. We conduct several experiments to show that our algorithm not only maintains high efficiency but also provides data privacy protection.
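To make the ideas of attribute affinity and privacy constraint checking concrete, here is a minimal sketch. It is not the algorithm proposed in the paper: the affinity measure (co-access frequency of attribute pairs), the representation of a privacy constraint (a set of attributes that must not all appear in one fragment), and the greedy merge strategy are all assumptions made for illustration, as are the function names.

```python
from itertools import combinations


def affinity(queries):
    """Count how often each attribute pair is accessed together.

    queries: list of (attribute_set, frequency) pairs (assumed format).
    """
    aff = {}
    for attrs, freq in queries:
        for a, b in combinations(sorted(attrs), 2):
            aff[(a, b)] = aff.get((a, b), 0) + freq
    return aff


def violates(fragment, constraints):
    """True if the fragment contains every attribute of some constraint."""
    return any(c <= fragment for c in constraints)


def greedy_partition(attributes, queries, constraints, n):
    """Merge fragments by descending affinity until n remain.

    Assumes every attribute used in queries is listed in attributes.
    A merge is skipped when it would violate a privacy constraint, so
    the result can contain more than n fragments.
    """
    fragments = [frozenset([a]) for a in attributes]
    pairs = sorted(affinity(queries).items(), key=lambda kv: -kv[1])
    for (a, b), _ in pairs:
        if len(fragments) <= n:
            break
        fa = next(f for f in fragments if a in f)
        fb = next(f for f in fragments if b in f)
        if fa == fb:
            continue
        merged = fa | fb
        if violates(merged, constraints):
            continue
        fragments = [f for f in fragments if f not in (fa, fb)] + [merged]
    return fragments
```

For example, with a constraint that `name` and `ssn` must not share a fragment, the pair is kept apart even when it has the highest affinity, and the remaining attributes are grouped around it.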