2019
DOI: 10.1007/978-3-030-28954-6_14

The (Un)reliability of Saliency Methods

Abstract: Saliency methods aim to explain the predictions of deep neural networks. These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction. We use a simple and common pre-processing step, adding a constant shift to the input data, to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute. In order to guarantee reliability, we posit that methods should fulfill input invariance, the requirement that a saliency method mirror the sensitivity of the model with respect to transformations of the input.
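The constant-shift test described in the abstract is easy to reproduce. Below is a minimal sketch (illustrative, not the authors' code) on a toy linear model: a second model absorbs the constant shift into its bias, so the two models make identical predictions everywhere, yet gradient-times-input attribution disagrees between them while the plain gradient does not. All names and values are assumptions for illustration.

```python
# Minimal sketch of the constant-shift test (illustrative, not the authors'
# code). Two linear models make identical predictions because the second
# absorbs a constant input shift c into its bias; an input-invariant
# attribution method must therefore agree on both.
import torch

torch.manual_seed(0)
w = torch.randn(10)            # weights of a toy linear model f(x) = w.x + b
b = torch.tensor(0.5)
x = torch.randn(10)
c = 2.0 * torch.ones(10)       # the constant shift added to the input

def f(inp, bias):
    return w @ inp + bias

b2 = b - w @ c                 # second model compensates for the shift
assert torch.allclose(f(x, b), f(x + c, b2))   # identical outputs

def grad_attr(inp, bias):
    inp = inp.detach().clone().requires_grad_(True)
    f(inp, bias).backward()
    return inp.grad

g1, g2 = grad_attr(x, b), grad_attr(x + c, b2)
print(torch.allclose(g1, g2))                # True: gradient is input-invariant
print(torch.allclose(g1 * x, g2 * (x + c)))  # False: gradient x input is not
```

The same construction extends to deeper networks by folding the shift into the first layer's bias, which is how the paper's examples are built.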

Cited by 420 publications (358 citation statements). References 9 publications.
“…Understanding and visualizing networks. Most prior work on network visualization concerns discriminative classifiers [1,3,23,26,37,38,48,50]. GANs have been visualized by examining the discriminator [32] and the semantics of internal features [4].…”
Section: Related Work (mentioning)
confidence: 99%
“…A rich set of image-based saliency methods has been developed over the years (e.g., [24,29,14,26,3,6]). One common approach to determining salient inputs is to rely on changes in the model output, such as gradients of the output with respect to the input features.…”
Section: Introduction (mentioning)
confidence: 99%
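To make the gradient-based approach in the snippet above concrete, here is a hedged sketch of a vanilla saliency map in PyTorch: the saliency of each pixel is the gradient of the top class score with respect to that pixel. The untrained ResNet-18 and the input shape are stand-in assumptions, not anything from the cited works.

```python
# Sketch of vanilla gradient saliency (assumed setup, not any paper's code).
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()             # untrained stand-in
image = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in input

score = model(image).max()   # score of the highest-scoring class
score.backward()             # populates image.grad with d(score)/d(pixel)

# Collapse the channel dimension so each pixel gets one saliency value.
saliency = image.grad.abs().max(dim=1).values            # shape (1, 224, 224)
```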
“…These attribution methods are able to integrate distributed representations throughout the network to arrive at the importance of each nucleotide variant for a given sequence. However, they only consider one sequence at a time and their scores are noisy (Kindermans et al., 2017; Adebayo et al., 2018), which can result in significant scores for nucleotide variants that are not necessarily biologically relevant. To address these issues, TF-MoDISco splits the attribution scores of each sequence into subsequences called seqlets, clusters and aligns these seqlets, then averages each cluster to provide more interpretable representations learned by the network (Shrikumar et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%
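The seqlet pipeline the snippet describes can be illustrated in a few lines. This is a loose sketch of the idea only, not TF-MoDISco itself: the window size, score threshold, planted motif, and hierarchical clustering are all assumptions made for illustration.

```python
# Loose illustration of the seqlet idea (not TF-MoDISco): cut high-scoring
# windows out of per-position attribution tracks, cluster them, and average
# each cluster into a cleaner pattern. All parameters are assumptions.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def extract_seqlets(attr, window=8, thresh=4.0):
    """Keep every length-`window` slice whose total attribution is high."""
    return [attr[i:i + window]
            for i in range(len(attr) - window + 1)
            if attr[i:i + window].sum() > thresh]

rng = np.random.default_rng(0)
tracks = rng.normal(0.0, 0.5, size=(100, 200))  # noisy attribution scores
tracks[:, 50:58] += 1.0                         # a shared planted motif

seqlets = np.array([s for t in tracks for s in extract_seqlets(t)])
labels = fcluster(linkage(seqlets, method="average"),
                  t=4, criterion="maxclust")

# Averaging within a cluster suppresses the per-sequence noise in the scores,
# which is the failure mode the snippet above points out.
patterns = [seqlets[labels == k].mean(axis=0) for k in np.unique(labels)]
```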