2019 | Preprint
DOI: 10.48550/arxiv.1906.06449

Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks

Abstract: Adversarial training was introduced as a way to improve the robustness of deep learning models to adversarial attacks. This training method improves robustness against adversarial attacks, but increases the model's vulnerability to privacy attacks. In this work we demonstrate how model inversion attacks, which extract training data directly from the model and were previously thought to be intractable, become feasible when attacking a robustly trained model. The input space for a traditionally trained model is dominated by …
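To make the attack the abstract refers to concrete: model inversion is commonly implemented as gradient-based optimization of the input against a fixed model. Below is a minimal, illustrative sketch in PyTorch; the untrained ResNet-50 placeholder, target class, and hyperparameters are assumptions for illustration, not the paper's setup.

```python
import torch
from torchvision.models import resnet50

# Minimal sketch of a model inversion attack: starting from noise,
# optimize an input so the model assigns it a chosen target class.
# Against robustly trained models, such reconstructions tend to
# reveal training-data-like features, which is the privacy risk
# the paper studies.
model = resnet50(weights=None)  # placeholder; assume robust weights loaded here
model.eval()

target_class = 207                   # hypothetical target label
x = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    logits = model(x)
    loss = -logits[0, target_class]  # maximize the target logit
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        x.clamp_(0.0, 1.0)           # keep pixels in a valid range
```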

Cited by 3 publications (3 citation statements) | References 11 publications
“…As robust models are typically easier to invert than naturally trained models (Santurkar et al., 2019; Mejia et al., 2019), we use a robust ResNet-50 (He et al., 2016) model trained on the ImageNet (Deng et al., 2009) dataset throughout this section as a toy example to examine how different augmentations impact inversion. Note, we perform the demonstrations in this section under slightly different conditions and with different models than those ultimately used for PII in order to highlight the effects of the augmentations as clearly as possible.…”
Section: Plug-in Inversion (mentioning)
Confidence: 99%
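As a rough illustration of the role augmentations play in the quoted setup, the sketch below averages the inversion objective over random crops and flips of the candidate image, so the reconstruction must explain the target class under many views. The weights, augmentation choices, and sample counts are illustrative assumptions, not the PII configuration.

```python
import torch
import torchvision.transforms as T
from torchvision.models import resnet50

# Sketch: an inversion step that averages gradients over random
# augmentations, discouraging reconstructions that only work for
# one fixed view of the input.
model = resnet50(weights=None)  # assume robust ImageNet weights loaded here
model.eval()

augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(),
])

target_class = 285  # hypothetical label
x = torch.randn(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(300):
    optimizer.zero_grad()
    # Average the objective over several augmented views of x; the
    # crop and flip transforms are differentiable w.r.t. the pixels.
    loss = torch.stack(
        [-model(augment(x))[0, target_class] for _ in range(4)]
    ).mean()
    loss.backward()
    optimizer.step()
```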
“…Cohen et al. [15] have shown certified robustness on large-scale ImageNet images and proved slight robustness against ℓ2 attacks. Moreover, research suggests that adversarial training is vulnerable to black-box attacks with several privacy issues and creates blind spots for further attacks [33, 50]. In other research directions, explainability [9, 14, 27], transparency [25], data lineage [51], privacy and security [13], and social well-being [41] have also been addressed.…”
Section: Building Trusted/Trustworthy AI Systems (mentioning)
Confidence: 99%
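For background on the first sentence of this excerpt: Cohen et al.'s randomized smoothing classifies many Gaussian-noised copies of the input and returns the majority vote, which is what yields the certified ℓ2 guarantee. A minimal, uncertified sketch of that prediction rule follows; the model handle, noise level, class count, and sample count are illustrative assumptions.

```python
import torch

def smoothed_predict(model, x, sigma=0.25, n_samples=100):
    """Majority-vote prediction under Gaussian input noise: the core
    of randomized smoothing. Actual certification requires far more
    samples plus a statistical test, both omitted here."""
    with torch.no_grad():
        counts = torch.zeros(1000)  # assuming 1000 ImageNet classes
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)
            counts[model(noisy).argmax(dim=1)] += 1
    return counts.argmax().item()
```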
“…Except for adversarial training, most existing defense algorithms are ineffective against optimization-based attacks [5], [10]. However, research suggests that adversarial training is vulnerable to black-box attacks with several privacy issues and creates blind spots for further attacks [26], [41]. Recently, Agarwal et al. [3] performed a study of the essential components in network training and adversarial example generation, which can further improve adversarial robustness.…”
Section: Related Work (mentioning)
Confidence: 99%