A New Privacy-Preserving Distributed k-Clustering Algorithm

Jagannathan, Geetha; Pillaipakkamnatt, Krishnan; Wright, Rebecca N.

doi:10.1137/1.9781611972764.47

Cited by 84 publications

(55 citation statements)

References 18 publications

“…However, these techniques typically do not work directly on the actual perturbed data (like our technique), but attempt to reconstruct the original data distribution using the known noise distribution that has been added on the dataset [1,17]. Privacy preservation can also be achieved through limited dataset view, for example, by horizontal or vertical distribution of the data to different sites [21,9,25]. In our setting, the dataset cannot be dissected in portions, but is being distributed as a whole.…”

Section: Methodology and Difference From Previous Work (mentioning)

confidence: 99%

Ownership protection of shape datasets with geodesic distance preservation

Vlachos, Lucchese, Rajan, et al. 2008

Proceedings of the 11th International Conference on Extending Database Technology: Advances in Database Technology - EDBT '08

Protection of one's intellectual property is a topic with important technological and legal facets. The significance of this issue is amplified nowadays due to the ease of data dissemination through the internet. Here, we provide technological mechanisms for establishing the ownership of a dataset consisting of multiple objects. The objects that we consider in this work are shapes (i.e., two dimensional contours), which abound in disciplines such as medicine, biology, anthropology and natural sciences. The protection of the dataset is achieved through means of embedding of an imperceptible ownership 'seal', that imparts only minute visual distortions. This seal needs to be embedded in the proper data space so that its removal or destruction is particularly difficult. Our technique is robust to many common transformations, such as data rotation, translation, scaling, noise addition and resampling. In addition to that, the proposed scheme also guarantees that important distances between the dataset shapes/objects are not distorted. We achieve this by preserving the geodesic distances between the dataset objects. Geodesic distances capture a significant part of the dataset structure, and their usefulness is recognized in many machine learning, visualization and clustering algorithms. Therefore, if a practitioner uses the protected dataset as input to a variety of mining, machine learning, or database operations, the output will be the same as on the original dataset. We illustrate and validate the applicability of our methods on image shapes extracted from anthropological and natural science data.

“…A stand-alone approach to privacy-preserving imputation can therefore be used in combination with any existing privacy-preserving data mining algorithm for the same distributed setting. In particular, our results in this paper are suitable for use with any privacy-preserving data mining algorithm for data that is horizontally partitioned between two parties (e.g., [23,20,18]). …”

Section: Related Work (mentioning)

confidence: 98%

Privacy-preserving imputation of missing data

Jagannathan, Wright 2008

Data & Knowledge Engineering

Self Cite

“…Contrary to the above approaches, we do not attempt to reconstruct the original data distribution but work directly on the perturbed data, while guaranteeing preservation of distance properties on them. Privacy-protection via dataset partition is achieved using horizontal or vertical data partitioning [29,12,30,31]. Different portions of the data are distributed to different sites, and data exchange without leakage of private information becomes possible through cryptographic techniques (multiparty computation).…”

Section: Related Work (mentioning)

confidence: 99%

Right-protected data publishing with hierarchical clustering preservation

Vlachos, Wieczorek, Schneider 2012

Proceedings of the 21st ACM International Conference on Information and Knowledge Management

The emergence of cloud-based storage services is opening up new avenues in data exchange and data dissemination. This has amplified the interest in right-protection mechanisms to establish ownership in case of data leakage. Current right-protection technologies, however, rarely provide strong guarantees on the dataset utility after the protection process. This work presents techniques that explicitly address this shortcoming and provably preserve the outcome of certain mining operations. In particular, we take special care to guarantee that the outcome of hierarchical clustering operations remains the same before and after right protection. We encode data ownership using watermarking principles. In the process, we derive fundamental bounds on the distortion incurred by the watermarking. We leverage our theoretical analysis to design fast algorithms for right protection without exhaustively searching the vast design space.
