2021
DOI: 10.1609/aaai.v35i3.16355

Enhanced Regularizers for Attributional Robustness

Abstract: Deep neural networks are the default choice of learning models for computer vision tasks. Extensive work has been carried out in recent years on explaining deep models for vision tasks such as classification. However, recent work has shown that these models can produce substantially different attribution maps even when two very similar images are given to the network, raising serious questions about trustworthiness. To address this issue, we propose a robust attribution training strategy to i…

Cited by 6 publications (11 citation statements)
References 21 publications

“…This makes the optimization unstable. To address this issue, Sarkar et al [97] used a triplet loss with softplus non-linearities. This approach focuses on pixels that attribute highly to the prediction of the true class (positive-class pixels).…”
Section: Combination Of Tuned
mentioning confidence: 99%
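
The quoted passage names the mechanism without detail; as a rough illustration of why a softplus (soft-margin) triplet formulation stabilizes optimization, here is a minimal PyTorch sketch. The function name, the flattening of attribution maps, and the Euclidean distance are illustrative assumptions, not details taken from Sarkar et al [97].

import torch.nn.functional as F

def soft_triplet_attribution_loss(anchor, positive, negative):
    # Soft-margin triplet loss over flattened attribution maps.
    # softplus(d_ap - d_an) = log(1 + exp(d_ap - d_an)) is a smooth
    # surrogate for the hinge form max(0, d_ap - d_an + margin), so
    # the gradient never cuts off abruptly at the margin boundary.
    d_ap = F.pairwise_distance(anchor.flatten(start_dim=1),
                               positive.flatten(start_dim=1))
    d_an = F.pairwise_distance(anchor.flatten(start_dim=1),
                               negative.flatten(start_dim=1))
    return F.softplus(d_ap - d_an).mean()

Because softplus is smooth everywhere, it avoids the dead or abruptly switching gradients of a hard hinge, which is consistent with the quoted claim that it addresses the unstable optimization.
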
“…So, contrastive learning cannot improve until focus is also given to negative- and positive-class pixels. Sarkar et al [97] applied this concept by using a regularizer that forces the true-class attribution distribution to be skew-shaped and the negative class to behave uniformly. They also used another regularizer to bound the change in pixel attribution weights when indistinguishable changes are made to the input.…”
Section: Combination Of Tuned
mentioning confidence: 99%
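
The statement compresses two regularizers into one sentence; the sketch below separates them. It is a minimal illustration under stated assumptions: gradient-of-logit saliency as the attribution method, entropy as the proxy for "skewed vs. uniform" distribution shape, and an L2 norm on the attribution difference. None of these specific choices should be read as the paper's exact formulation.

import torch

def saliency(model, x, target):
    # Gradient-of-the-target-logit saliency: one common attribution
    # choice, assumed here for illustration only. create_graph=True
    # keeps the penalty differentiable so it can be trained through.
    x = x.clone().requires_grad_(True)
    score = model(x).gather(1, target.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad.flatten(start_dim=1)

def attribution_change_penalty(model, x, x_pert, target):
    # Penalize (and thereby bound) the change in pixel attributions
    # when the input receives a small, visually indistinguishable
    # perturbation x_pert.
    return (saliency(model, x, target)
            - saliency(model, x_pert, target)).norm(dim=1).mean()

def distribution_shape_penalty(attr_true, attr_neg, eps=1e-12):
    # Push true-class attributions toward a peaked (skewed) shape and
    # negative-class attributions toward uniform. Entropy serves as the
    # shape proxy (low = peaked, high = uniform); this proxy is an
    # assumption, not necessarily the paper's exact regularizer.
    p_true = torch.softmax(attr_true, dim=1)
    p_neg = torch.softmax(attr_neg, dim=1)
    entropy = lambda p: -(p * p.clamp_min(eps).log()).sum(dim=1)
    return (entropy(p_true) - entropy(p_neg)).mean()

In a training loop, both penalties would simply be added to the classification loss with scalar weights; the weights and the perturbation-generation scheme are further choices the quoted statement does not specify.
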
“…(Lipton 2018; Samek et al 2019; Fan et al 2021; Zhang et al 2020) present detailed surveys on these methods. Recently, the growing number of attribution methods has led to a concerted focus on studying the robustness of attributions to input perturbations, to handle potential security hazards (Chen et al 2019; Sarkar, Sarkar, and Balasubramanian 2021; Wang and Kong 2022; Agarwal et al 2022). One could view these efforts as akin to adversarial robustness, which focuses on defending against attacks on model predictions, whereas attributional robustness focuses on defending against attacks on model explanations.…”
Section: Introduction
mentioning confidence: 99%