Proceedings of the 2005 SIAM International Conference on Data Mining 2005
DOI: 10.1137/1.9781611972757.9

Privacy-Preserving Classification of Customer Data without Loss of Accuracy

Abstract: Privacy has become an increasingly important issue in data mining. In this paper, we consider a scenario in which a data miner surveys a large number of customers to learn classification rules on their data, while the sensitive attributes of these customers need to be protected. Solutions have been proposed to address this problem using randomization techniques. Such solutions exhibit a tradeoff of accuracy and privacy: the more each customer's private information is protected, the less accurate result the min…

Citations: cited by 150 publications (117 citation statements)
References: 31 publications
“…In contrast, the cryptographic approaches proposed in [6], [25] provided strong privacy without loss of accuracy. The key idea of these approaches is a private frequency computation method in the fully distributed setting that allows the miner to compute frequencies of values or tuples in the data set, while preserving privacy of each user's data.…”
Section: Related Work (mentioning)
confidence: 94%
“…A general definition of secure multi-party computation in the semi-honest model is stated in [8]. This definition was derived to make a simplified definition in the semi-honest model for privacy preserving data mining in the fully distributed setting scenario [6], [25]. This scenario is similar to 2PFD setting, so here we consider the possibility that some corrupted users share their data with the miner to derive the private data of the honest users, we assume that all users are semi-honest, thus any user can be corrupted.…”
Section: Definition Of Privacy (mentioning)
confidence: 99%
“…In this section we briefly describe the work in [1], which is, to our best of knowledge, the first scheme that used a variant of the homomorphic election model in order to build a privacy preserving frequency mining algorithm. This algorithm is then used in [1] as a building block to design a protocol for naive Bayes learning.…”
Section: Reviewing the (Yang Et Al) Scheme (mentioning)
confidence: 99%
“…To this end we argue in favor of borrowing knowledge from a broad literature dealing with cryptographic elections via the Internet. We discuss some weaknesses and describe an attack on a recent PPDM scheme of Yang, Zhong and Wright [1] which, to our best knowledge, was the first work that used a variation of the classical homomorphic model [25] for online elections. Our PPDM approach will be based on the classical homomorphic model of Cramer, Gennaro and Schoenmakers [25] for online elections, and more particularly on some recent extensions proposed in [26,27] for multi-candidate elections.…”
Section: Introduction (mentioning)
confidence: 99%
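
The citation statements above describe a private frequency computation in the fully distributed setting: each user holds one private value, and the miner learns only how many users hold it. The sketch below illustrates one way such a tally can be built from an additively homomorphic, ElGamal-style encoding in which the users' keys cancel out in the product. It is an assumption-laden illustration, not a confirmed reproduction of any cited protocol: the function names, the toy group parameters (P = 2039, Q = 1019, G = 4), and the single-process simulation of all parties are hypothetical choices made here, and the group is far too small for real use.

```python
import secrets

# Toy group parameters (assumption, illustration only): P = 2*Q + 1 is a safe
# prime and G generates the subgroup of prime order Q. Real deployments need
# cryptographically large parameters.
P = 2039
Q = 1019
G = 4


def keygen():
    """Return a (private, public) pair with public = G^private mod P."""
    private = secrets.randbelow(Q - 1) + 1
    return private, pow(G, private, P)


def private_frequency(bits):
    """Simulate users and miner; return how many users hold the bit 1.

    Each user i holds a private bit d_i and two key pairs (x_i, X_i) and
    (y_i, Y_i). With X = prod X_i and Y = prod Y_i, user i sends
        m_i = G^d_i * X^y_i   and   h_i = Y^x_i.
    The product of all m_i / h_i equals G^(sum of d_i), because the key
    terms G^((sum x)(sum y)) cancel.
    """
    n = len(bits)
    if n >= Q:
        raise ValueError("too many users for this toy group")

    x_keys = [keygen() for _ in bits]
    y_keys = [keygen() for _ in bits]

    # Combined public keys, computable by anyone from the published X_i, Y_i.
    big_x = 1
    big_y = 1
    for (_, x_pub), (_, y_pub) in zip(x_keys, y_keys):
        big_x = big_x * x_pub % P
        big_y = big_y * y_pub % P

    # Messages each user would send to the miner.
    messages = []
    for d, (x_priv, _), (y_priv, _) in zip(bits, x_keys, y_keys):
        m = pow(G, d, P) * pow(big_x, y_priv, P) % P
        h = pow(big_y, x_priv, P)
        messages.append((m, h))

    # Miner side: multiply all m_i / h_i (modular inverse needs Python 3.8+).
    r = 1
    for m, h in messages:
        r = r * m * pow(h, -1, P) % P

    # The tally is at most n, so a linear search recovers it from G^tally.
    for count in range(n + 1):
        if pow(G, count, P) == r:
            return count
    raise ValueError("tally could not be decoded")
```

In this sketch the miner sees only the pairs (m_i, h_i) and the final count, and the count is exact rather than perturbed, which is the contrast with randomization that the quoted statements draw. Calling `private_frequency([1, 0, 1, 1, 0])` should recover the count of ones in the list.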