2021
DOI: 10.48550/arxiv.2104.12669
Preprint

Exploiting Explanations for Model Inversion Attacks

Abstract: The successful deployment of artificial intelligence (AI) in many domains, from healthcare to hiring, requires its responsible use, particularly with respect to model explanations and privacy. Explainable artificial intelligence (XAI) provides more information to help users understand model decisions, yet this additional knowledge exposes additional risks for privacy attacks. Hence, providing explanations harms privacy. We study this risk for image-based model inversion attacks and identify several attack architectures…
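The abstract is truncated, so the paper's actual attack architectures are not shown here. As a rough, hypothetical sketch of the general idea it describes, the snippet below shows an inversion network that takes the target model's prediction vector together with a released saliency-map explanation and decodes them back into an image. The class name ExplanationAwareInverter, the input dimensions, and the fusion design are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch (not the paper's method): an inversion network that
# reconstructs a private input image from a target model's prediction vector
# plus a released saliency-map explanation.
import torch
import torch.nn as nn


class ExplanationAwareInverter(nn.Module):
    def __init__(self, num_classes=10, img_size=32):
        super().__init__()
        self.img_size = img_size
        # Encode the saliency map (1 x H x W) with a small CNN.
        self.saliency_encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        enc_dim = 32 * (img_size // 4) ** 2
        # Fuse the prediction vector with the encoded explanation and
        # decode back to image space.
        self.decoder = nn.Sequential(
            nn.Linear(num_classes + enc_dim, 512), nn.ReLU(),
            nn.Linear(512, img_size * img_size), nn.Sigmoid(),
        )

    def forward(self, prediction, saliency_map):
        z = self.saliency_encoder(saliency_map)
        fused = torch.cat([prediction, z], dim=1)
        return self.decoder(fused).view(-1, 1, self.img_size, self.img_size)


# Usage: given black-box confidences and released explanations, an adversary
# would train the inverter with a pixel-wise reconstruction loss on auxiliary
# data (training loop omitted).
inverter = ExplanationAwareInverter()
pred = torch.softmax(torch.randn(4, 10), dim=1)   # target model confidences
saliency = torch.rand(4, 1, 32, 32)               # released explanation
reconstruction = inverter(pred, saliency)         # shape: (4, 1, 32, 32)
```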

Cited by 2 publications (2 citation statements)
References 31 publications
“…Research has shown that the outputs of feature importance methods like SHAP, through repeated queries, are prone to membership attacks that can reveal intimate details about the classification boundary [69]. Similar research has shown that counterfactual explanations are vulnerable to similar attacks [3], as are image-based explanations like saliency maps [89]. Such results reveal the risk that, when explanations are provided to external stakeholders, the recipients of those explanations can collude to reconstruct the inner workings of a model.…”
Section: Challenge 5: XAI Techniques in General Lack Robustness and Ha… (mentioning, confidence: 99%)
“…In addition, recent works [29,33,36,43] have shown that an adversary could train a substitute model via model stealing and use it to craft adversarial examples [12] in a black-box setting, which poses a serious threat when the model is deployed in security-critical applications. A stolen model could also compromise the privacy of users by leaking confidential data through a membership inference attack [32] or model inversion [41,42]. Fig.…”
Section: Introduction (mentioning, confidence: 99%)