2022
DOI: 10.1145/3547139

A Critical Review on the Use (and Misuse) of Differential Privacy in Machine Learning

Abstract: We review the use of differential privacy (DP) for privacy protection in machine learning (ML). We show that, driven by the aim of preserving the accuracy of the learned models, DP-based ML implementations are so loose that they do not offer the ex ante privacy guarantees of DP. Instead, what they deliver is basically noise addition similar to the traditional (and often criticized) statistical disclosure control approach. Due to the lack of formal privacy guarantees, the actual level of…
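For context, the ex ante guarantee the abstract refers to is the standard (ε, δ)-differential privacy definition, which is not spelled out in the snippet above: a randomized mechanism M is (ε, δ)-differentially private if, for every pair of datasets D and D' differing in one record and every set S of outputs,

\[
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta .
\]

A small ε keeps this worst-case bound tight regardless of the adversary's background knowledge; large ε (or loose accounting of it) makes the bound nearly vacuous, which is the gap the paper argues many DP-based ML implementations fall into.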

Cited by 28 publications (12 citation statements).
References 45 publications.
“…We would like to notice that this is not the usual range of values used in the literature. For example, in privacy-preserving data publishing, values of ϵ above 3 progressively seem to lose any meaningful guarantees [3]. However, for us, the fact that we will be using such large values is irrelevant, since we will empirically measure privacy leakage not through the ε itself, but through the effectiveness of an MIA.…”
Section: Results (mentioning, confidence: 99%)
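A minimal sketch (illustrative only, not from the cited work) of why ε values above roughly 3 lose meaningful guarantees: the factor e^ε bounds how much more likely any outcome becomes when one record is present, and by the hypothesis-testing view of DP it also bounds a membership inference attacker's true-positive rate as TPR ≤ e^ε · FPR + δ.

    import math

    # Illustrative: growth of the multiplicative DP bound e^eps (delta taken as 0).
    for eps in [0.1, 1.0, 3.0, 8.0]:
        ratio = math.exp(eps)
        fpr = 0.01  # assumed false-positive rate for a membership inference attack (MIA)
        tpr_bound = min(1.0, ratio * fpr)  # TPR <= e^eps * FPR when delta = 0
        print(f"eps={eps:4.1f}: e^eps={ratio:8.1f}, MIA TPR bound at 1% FPR: {tpr_bound:.2f}")

At ε = 3 the bound on the attacker's true-positive rate at 1% false positives is already about 0.20; at ε = 8 it reaches 1.0, i.e. the formal guarantee says nothing, which is why such settings are instead evaluated empirically through MIA effectiveness.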
“…We set this value to 1/1000 in all experiments. Existing literature prescribes that the δ parameter should be smaller than one divided by the data set size (Blanco-Justicia et al 2022; Hsu et al 2014; van Dijk and Nguyen 2023). Algorithm 1 uses a privacy bound for the contacts per day, and as our simulators have a max of 200 contacts per day, and an average of only fifteen contacts per day, 1/1000 is well below the recommended standard.…”
Section: Methods (mentioning, confidence: 99%)
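A small sketch of the "δ below one over the data set size" rule of thumb quoted above; the second check uses a hypothetical record count, not a figure from the cited work.

    # Check a chosen delta against the common recommendation delta < 1 / (data set size).
    def delta_is_conservative(delta: float, n: int) -> bool:
        return delta < 1.0 / n

    delta = 1 / 1000
    print(delta_is_conservative(delta, 200))   # True: at most 200 contacts per day, and 1/1000 < 1/200
    print(delta_is_conservative(delta, 5000))  # False: too large if the guarantee had to cover 5000 records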
“…The present contribution is scoped exclusively on Input Privacy techniques. Previous studies have exposed fundamental flaws of some recently proposed Output Privacy approaches aimed at sharing highly granular data (individual records, micro-data) and/or high-dimensional AI/ML models trained on personal data, see for instance Stadler and Troncoso (2022), Stadler et al (2022), Blanco-Justicia et al (2022), and Domingo-Ferrer et al (2021) and references therein. In the author’s opinion, the hurried adoption of these techniques, in certain business application contexts and without careful further scrutiny, may be more akin to privacywashing than to privacy enhancing.…”
Section: Input Privacy and Output Privacy (mentioning, confidence: 99%)