2022
DOI: 10.48550/arxiv.2205.15419
Preprint

Fooling SHAP with Stealthily Biased Sampling

Abstract: SHAP explanations aim at identifying which features contribute the most to the difference in model prediction at a specific input versus a background distribution. Recent studies have shown that they can be manipulated by malicious adversaries to produce arbitrary desired explanations. However, existing attacks focus solely on altering the black-box model itself. In this paper, we propose a complementary family of attacks that leave the model intact and manipulate SHAP explanations using stealthily biased sampling…
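The mechanism the abstract describes can be sketched in a few lines of Python. The example below is a minimal illustration, not the paper's actual attack: it assumes the `shap` and `scikit-learn` packages, and the biasing rule (keeping only background points whose first feature is close to the explained input's) is a hypothetical stand-in for the paper's optimized sampling. It shows that KernelSHAP attributions depend on the background sample, so an adversary who controls that sample can shift attributions while leaving the model untouched.

```python
# Minimal sketch (not the paper's algorithm): SHAP attributions contrast an
# input against a background sample, so a cherry-picked background can shift
# them while the model itself stays intact.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# The model is trained once and never modified; only the background changes.
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
x_explain = X[:1]  # the input whose explanation the adversary wants to skew

honest_bg = X[:100]  # an unbiased background sample
# Hypothetical biasing rule: keep only background points whose feature 0 is
# close to the explained input's, shrinking feature 0's apparent contribution.
close = np.abs(X[:, 0] - x_explain[0, 0]) < 0.3
biased_bg = X[close][:100]

for name, bg in [("honest", honest_bg), ("biased", biased_bg)]:
    f = lambda d: model.predict_proba(d)[:, 1]
    explainer = shap.KernelExplainer(f, bg)
    phi = explainer.shap_values(x_explain, nsamples=200)
    print(name, np.round(phi, 3))
```

Under the biased background, feature 0's attribution shrinks even though neither the model nor the input changed; per the abstract, the paper's attacks select such biased samples deliberately and stealthily rather than by this crude heuristic.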

Cited by 3 publications (2 citation statements)
References 6 publications

“…On the other hand, black boxes can effortlessly attain high performance but their decision mechanisms are opaque and hard to understand by both experts and non-experts. Also, post-hoc explanations of these complex models have been shown to be unreliable and highly manipulable by ill-intentioned entities (Aïvodji et al., 2019; Slack et al., 2020; Dimanov et al., 2020; Laberge et al., 2022; Aïvodji et al., 2021). This conundrum between black-box and transparent designs is colloquially referred to as the "accuracy-transparency trade-off": one has to choose between transparent models with lower performance or opaque models that perform well but whose explanations are not trustworthy.…”
Section: Introduction
confidence: 99%
“…To address the above shortcomings, we adopt machine learning methods and design deep learning models to predict anesthesia recovery time. We also adopt a machine learning interpretation toolkit, such as the SHAP toolkit [22,23], to help anesthesiologists judge the importance of features. Our main contributions in this work are summarized as follows.…”
Section: Introduction
confidence: 99%