2021
DOI: 10.48550/arxiv.2105.12837
Preprint

Fooling Partial Dependence via Data Poisoning

Abstract: Many methods have been developed to understand complex predictive models, and high expectations are placed on post-hoc model explainability. It turns out that such explanations are neither robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which are among the most popular methods of explaining any predictive model trained on tabular data. We showcase that PD can be manipulated in an adversarial manner, which is alarming, especially…
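
For orientation, the Partial Dependence profile of a feature is the model's prediction averaged over a dataset with that feature fixed at each value of a grid; the attack perturbs the data over which that average is taken. A minimal sketch of the estimator, with illustrative names not taken from the paper:

```python
# Minimal sketch of a one-dimensional Partial Dependence profile.
# `model`, `X`, `feature`, and `grid` are illustrative names, not the paper's code.
import numpy as np

def partial_dependence_profile(model, X, feature, grid):
    """Average prediction over the rows of X with column `feature` fixed at each grid value."""
    values = []
    for z in grid:
        X_fixed = X.copy()
        X_fixed[:, feature] = z                       # fix the explained feature at z
        values.append(model.predict(X_fixed).mean())  # average over the background data
    return np.array(values)
```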


Cited by 5 publications (12 citation statements) | References 29 publications
“…• L_p Distance: An intuitive and straightforward metric to compare two explanation maps is to compute the normed distance between them. Some of the widely used metrics are the median L_1 distance, used in [15], and the L_2 distance [17,36,107]. Mean Squared Error (MSE), a metric derived from the L_2 distance, is also very popular.…”
Section: Evaluating the Robustness of Explanation Methods (mentioning, confidence: 99%)
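
For concreteness, these distances could be computed as follows for two explanation maps given as equally shaped arrays; `e1` and `e2` are hypothetical inputs, not names from the cited works:

```python
# Sketch of the distance metrics listed above for two explanation maps.
import numpy as np

def median_l1(e1, e2):
    return np.median(np.abs(e1 - e2))            # median L_1 distance

def l2(e1, e2):
    return np.linalg.norm((e1 - e2).ravel())     # L_2 (Euclidean) distance

def mse(e1, e2):
    return np.mean((e1 - e2) ** 2)               # Mean Squared Error, derived from L_2
```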
“…(3) Trust attack (Figure 2c) leads to different predictions but similar explanations. (3) Tabular data (ML models) [17]: the dataset is poisoned to conceal the suspected behavior (bias). Deep neural network models are currently used as black boxes due to their high accuracy for various tasks.…”
Section: Attacking Explainability Methods (mentioning, confidence: 99%)
“…Let us consider the example of the Heart Disease dataset¹, which deals with the binary classification for the presence of a heart disease. A Support Vector Classification model was built for this dataset and then a technique described by Baniecki et al. (2021) was applied to perturb the dataset in a way that alters the PDP curve, which could be done maliciously to influence the result. The new perturbed dataset has the same dimension as the original dataset.…”
Section: Context-sensitive Explanations (mentioning, confidence: 99%)
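
The cited technique keeps the model fixed and perturbs only the dataset over which the Partial Dependence is averaged. The sketch below illustrates that idea on synthetic data, using a crude random search as a stand-in for the optimisation actually proposed by Baniecki et al. (2021); the dataset, names, and hyperparameters are all illustrative, not the authors' implementation:

```python
# Hypothetical sketch: shift a fixed SVC's PD curve by perturbing the background data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def pd_profile(model, X, feature, grid):
    # Average decision score over X with the explained column fixed at each grid value.
    out = []
    for z in grid:
        Xz = X.copy()
        Xz[:, feature] = z
        out.append(model.decision_function(Xz).mean())
    return np.array(out)

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
model = SVC(kernel="rbf").fit(X, y)            # the model itself is never modified

feature = 0
grid = np.linspace(X[:, feature].min(), X[:, feature].max(), 20)
pd_original = pd_profile(model, X, feature, grid)

rng = np.random.default_rng(0)
X_poisoned, best_shift = X.copy(), 0.0
for _ in range(200):                           # naive random search over small perturbations
    candidate = X_poisoned.copy()
    candidate[:, 1:] += rng.normal(scale=0.05, size=candidate[:, 1:].shape)
    shift = np.abs(pd_profile(model, candidate, feature, grid) - pd_original).mean()
    if shift > best_shift:                     # keep perturbations that move the PD curve
        X_poisoned, best_shift = candidate, shift

print("mean absolute PD shift:", best_shift)   # same shape as the original data, altered PDP
```

Only the non-explained columns are perturbed, since the explained column is overwritten by the grid when the profile is computed, so changing it cannot move the curve.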
“…Recently, several studies reported that such a manipulation is possible, e.g. by modifying the black-box model to be explained [8,9,10], by manipulating the computation algorithms of feature attributions [11,12], and by poisoning the data distribution [13]. With these findings in mind, the current possible advice to the auditors is not to rely solely on the reported feature attributes for fairness auditing.…”
Section: Introduction (mentioning, confidence: 99%)