2012
DOI: 10.1016/j.procs.2012.09.050

Towards A Differential Privacy and Utility Preserving Machine Learning Classifier

Abstract: Many organizations transact in large amounts of data often containing personal identifiable information (PII) and various confidential data. Such organizations are bound by state, federal, and international laws to ensure that the confidentiality of both individuals and sensitive data is not compromised. However, during the privacy preserving process, the utility of such datasets diminishes even while confidentiality is achieved--a problem that has been defined as NP-Hard. In this paper, we investigate a diffe…

Cited by 37 publications (29 citation statements)
References 18 publications
“…This method does not require discretization preprocessing of the data. In [20], Mivule et al. proposed a framework that uses AdaBoost iterations to update the dataset until the forest achieves an acceptable level of prediction accuracy. However, the framework lacks detail: it does not specify the differential privacy mechanism used or how the privacy budget is allocated.…”
Section: Related Work
Mentioning confidence: 99%
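The excerpt above describes that framework only at a high level, so the following is a hypothetical sketch of the kind of iteration loop it alludes to: Laplace input perturbation (an assumed mechanism, since the cited framework does not specify one), an AdaBoost ensemble, and a loop that adjusts the privacy budget until a target accuracy is reached. The dataset, threshold, and budget schedule are illustrative assumptions, not the authors' actual design.

```python
# Hypothetical sketch of the iteration described above: perturb the data,
# train an AdaBoost ensemble, and repeat until prediction accuracy is
# acceptable. The mechanism choice and budget schedule are assumptions,
# since the cited framework leaves them unspecified.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

epsilon, target_acc = 0.5, 0.90          # assumed starting budget and accuracy threshold
sensitivity = X_train.max(axis=0) - X_train.min(axis=0)  # per-attribute range

for round_ in range(10):
    # Laplace input perturbation at the current epsilon (assumed mechanism).
    noise = rng.laplace(0.0, sensitivity / epsilon, size=X_train.shape)
    model = AdaBoostClassifier(n_estimators=50).fit(X_train + noise, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"round {round_}: epsilon={epsilon:.2f}, accuracy={acc:.3f}")
    if acc >= target_acc:
        break
    epsilon *= 1.5                        # relax privacy until utility is acceptable
```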
“…Therefore, in this study, we focused on using input perturbation to perform a differentially private classification task. To this end, we adopted the input perturbation technique of differential privacy, as used in (Mivule et al., 2012; Sánchez et al., 2016; Sarwate & Chaudhuri, 2013; Senekane, 2019), to perform privacy-preserving classification. We experimentally analyzed the performance of the well-known classification algorithms C4.5, Naïve Bayes, Bayesian Networks, IBk, K*, One Rule, PART, Random Tree, and Ripper on differentially private data, obtained by applying input perturbation to 8 widely used UCI datasets at various privacy levels, varying ɛ from 1 to 5 and also considering small ɛ values (i.e., ɛ < 1).…”
Section: Differentially Private Classification
Mentioning confidence: 99%
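As a rough, non-authoritative illustration of the setup described in that excerpt, the sketch below applies Laplace input perturbation at several ɛ values and compares a few classifiers on the perturbed data. scikit-learn estimators stand in for the WEKA ones (DecisionTreeClassifier for C4.5, GaussianNB for Naïve Bayes, KNeighborsClassifier for IBk), and the wine dataset stands in for the 8 UCI datasets; these substitutions are assumptions, not the study's actual pipeline.

```python
# Rough illustration of the setup described above: apply Laplace input
# perturbation at several epsilon values and compare classifiers on the
# perturbed data. scikit-learn models stand in for the WEKA ones
# (DecisionTreeClassifier ~ C4.5, GaussianNB ~ Naive Bayes,
# KNeighborsClassifier ~ IBk); the dataset and parameters are assumptions.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X, y = load_wine(return_X_y=True)
sensitivity = X.max(axis=0) - X.min(axis=0)   # per-attribute range as assumed sensitivity

classifiers = {
    "C4.5-like tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "IBk (k-NN)": KNeighborsClassifier(n_neighbors=3),
}

for eps in (0.5, 1, 2, 3, 4, 5):              # small epsilon plus 1..5, as in the study
    X_priv = X + rng.laplace(0.0, sensitivity / eps, size=X.shape)
    for name, clf in classifiers.items():
        score = cross_val_score(clf, X_priv, y, cv=5).mean()
        print(f"eps={eps}: {name:15s} accuracy={score:.3f}")
```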
“…Input perturbation: In this technique, the data is perturbed by adding noise to the values of its numerical attributes for a given privacy level ɛ. Definition: let x be a d-dimensional vector representing a data instance in a database D. The differentially private version of x is x_priv = x + z, where z is a d-dimensional noise vector drawn from the Laplace probability density. By adding such noise to each individual data vector x_i in the database D, it is guaranteed that the resulting database D_priv = (x_priv1, x_priv2, x_priv3, …, x_privn) is an ɛ-differentially private approximation of D (Antonova, 2016; Dwork et al., 2006; Mivule et al., 2012; Sarwate & Chaudhuri, 2013; Senekane, 2019).…”
Section: Definitions and Formulations of Differential Privacy
Mentioning confidence: 99%
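A minimal sketch of that definition, assuming the per-attribute range serves as the sensitivity that sets the Laplace scale (the cited works derive the scale from sensitivity and ɛ); the function name and toy data are illustrative.

```python
# Minimal sketch of the definition above: x_priv = x + z with z drawn from a
# Laplace density. The per-attribute range is an assumed sensitivity used to
# set the Laplace scale b = sensitivity / epsilon.
import numpy as np

def perturb_database(D: np.ndarray, epsilon: float, rng=None) -> np.ndarray:
    """Return an epsilon-differentially-private approximation D_priv of D
    by adding Laplace noise to every d-dimensional record x."""
    rng = rng or np.random.default_rng()
    sensitivity = D.max(axis=0) - D.min(axis=0)   # assumed per-attribute sensitivity
    scale = sensitivity / epsilon                 # Laplace scale b = sensitivity / epsilon
    z = rng.laplace(0.0, scale, size=D.shape)     # one d-dimensional noise vector per record
    return D + z                                  # x_priv = x + z for every x in D

# Example usage on a toy numeric database.
D = np.array([[63.0, 120.0], [45.0, 135.0], [58.0, 110.0]])
D_priv = perturb_database(D, epsilon=1.0, rng=np.random.default_rng(7))
print(D_priv)
```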
“…Much research has been done by scholars at home and abroad. Reference [4] raised the issue of balancing availability and privacy in differential privacy protection. Because differential privacy protection is a data distortion technique, balancing availability and privacy is an NP-hard problem.…”
Section: Introduction
Mentioning confidence: 99%