This paper considers fuzzy co-clustering of distributed cooccurrence data, where vertically partitioned cooccurrence information among objects and items is stored in several sites. In order to utilize such distributed data sets without fear of information leaks, a privacy-preserving procedure is introduced into fuzzy clustering for categorical multivariate data (FCCM). The elements of the cooccurrence matrices are withheld; only object memberships are shared among the sites, and their (implicit) joint co-cluster structures are revealed through an iterative clustering process. Several experimental results demonstrate that combining the distributed data sets improves the individual co-clustering results of each site.
In many real-world data analysis tasks, much more useful knowledge can be expected from utilizing multiple databases stored in different organizations, such as cooperation groups, state organs, and allied countries. However, such organizations often hesitate to publish their databases because of privacy and security issues, even though they recognize the advantages of collaborative analysis. This paper proposes a novel collaborative framework for utilizing vertically partitioned cooccurrence matrices in fuzzy co-cluster structure estimation, in which cooccurrence information among objects and items is stored separately in several sites. In order to utilize such distributed data sets without fear of information leaks, a privacy-preserving procedure is introduced into fuzzy clustering for categorical multivariate data (FCCM). The elements of the cooccurrence matrices are withheld; only object memberships are shared among the sites, and their (implicit) joint co-cluster structures are revealed through an iterative clustering process. Several experimental results demonstrate that collaborative analysis, unlike individual site-wise analysis, can help reveal the global intrinsic co-cluster structures of the separate matrices. The novel framework makes it possible for many private and public organizations to share common data structural knowledge without fear of information leaks.
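The data flow described in these abstracts can be illustrated with a minimal sketch. It assumes an entropy-regularized form of the FCCM updates, that each site normalizes item memberships over its own items only, and that each site contributes a per-object score that is summed in the clear. The function names (`softmax`, `fccm_vertical`), the parameters `lam_u` and `lam_w`, and the toy data are hypothetical. No encryption or secure aggregation is implemented, so this shows only which quantities move between sites, not the privacy-preserving procedure itself.

```python
import numpy as np


def softmax(scores, axis):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def fccm_vertical(sites, n_clusters, lam_u=0.1, lam_w=1.0, n_iter=50, seed=0):
    """Sketch of FCCM-style co-clustering over vertically partitioned data.

    sites: list of (n_objects x m_t) cooccurrence matrices; every site
           holds the same objects (rows) but different items (columns).
    Returns the shared object memberships u (n_clusters x n_objects) and
    a list of per-site item memberships (n_clusters x m_t).
    """
    n_objects = sites[0].shape[0]
    rng = np.random.default_rng(seed)
    # Random initial object memberships; each column sums to 1 over clusters.
    u = rng.dirichlet(np.ones(n_clusters), size=n_objects).T
    item_memberships = []
    for _ in range(n_iter):
        item_memberships = []
        total = np.zeros((n_clusters, n_objects))
        for R in sites:
            # Local step (assumption): item memberships are computed from the
            # shared u and the site's private matrix R, normalized per cluster
            # over that site's own items.
            w = softmax((u @ R) / lam_w, axis=1)
            item_memberships.append(w)
            # Per-site contribution to the object scores. Only this aggregate
            # leaves the site here, never the individual elements of R.
            # (Plain summation is a stand-in for a secure protocol.)
            total += w @ R.T
        # Shared step: object memberships, normalized per object over clusters.
        u = softmax(total / lam_u, axis=0)
    return u, item_memberships


# Toy data: 8 objects, two sites holding different item columns.
site_a = np.zeros((8, 4))
site_a[:4, :2] = 1.0
site_a[4:, 2:] = 1.0
site_b = np.zeros((8, 6))
site_b[:4, :3] = 1.0
site_b[4:, 3:] = 1.0

u, ws = fccm_vertical([site_a, site_b], n_clusters=2)
print(u.argmax(axis=0))  # hard cluster label per object
```

In this sketch the item memberships never leave their site; the sites exchange only the object memberships and the aggregated object scores.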
Privacy-preserving data mining is a promising topic for utilizing various personal information without fear of information leaks. Fuzzy co-clustering is a fundamental technique for summarizing mutual cooccurrence information among objects and items, and has been demonstrated to be useful in such applications as document analysis and collaborative filtering. In this paper, a secure framework for privacy-preserving fuzzy co-clustering is proposed for handling both vertically and horizontally distributed cooccurrence matrices. Personal observations stored in each site are summarized into co-cluster structures with an encryption operation. The advantage of utilizing distributed cooccurrence matrices is demonstrated in several numerical experiments.