2020
DOI: 10.1038/s42256-020-00265-z

Concept whitening for interpretable image recognition

Abstract: What does a neural network encode about a concept as we traverse through the layers? Interpretability in machine learning is undoubtedly important, but the calculations of neural networks are very challenging to understand. Attempts to see inside their hidden layers can be misleading, unusable, or rely on the latent space to possess properties that it may not have. In this work, rather than attempting to analyze a neural network post hoc, we introduce a mechanism, called concept whitening (CW), to alter …
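As a rough illustration of the idea (not the authors' implementation), the sketch below whitens a batch of latent features, so they are decorrelated with unit variance, and then applies an orthogonal rotation intended to align individual axes with predefined concepts. The class name, the QR-based initialization of the rotation, and the omission of any concept-alignment training step are assumptions made for this sketch only.

```python
import torch
import torch.nn as nn

class ConceptWhiteningSketch(nn.Module):
    """Hypothetical sketch of a concept-whitening-style layer (not the authors' code).

    Step 1: whiten the incoming features (zero mean, identity covariance).
    Step 2: apply an orthogonal rotation whose columns are intended to be
            aligned with predefined concepts during training.
    """

    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Orthogonal rotation matrix. Here it is only initialized orthogonally;
        # keeping it orthogonal while aligning axes with concept examples would
        # require a constrained optimizer, which is omitted in this sketch.
        q, _ = torch.linalg.qr(torch.randn(num_features, num_features))
        self.register_buffer("rotation", q)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features)
        xc = x - x.mean(dim=0, keepdim=True)              # center the batch
        cov = xc.t() @ xc / max(x.shape[0] - 1, 1)        # sample covariance
        cov = cov + self.eps * torch.eye(x.shape[1], device=x.device)
        eigvals, eigvecs = torch.linalg.eigh(cov)         # symmetric eigendecomposition
        inv_sqrt = eigvecs @ torch.diag(eigvals.clamp_min(self.eps).rsqrt()) @ eigvecs.t()
        z = xc @ inv_sqrt                                 # ZCA-style whitening
        return z @ self.rotation                          # rotate toward concept axes

# Usage: decorrelated, unit-variance, rotated features for a random batch.
layer = ConceptWhiteningSketch(num_features=8)
out = layer(torch.randn(32, 8))
```

In the paper's setting the rotation would be optimized so that each axis responds to examples of one concept while remaining orthogonal; here it is fixed at initialization, which is the main simplification of this sketch.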

Cited by 214 publications (201 citation statements)
References 36 publications
“…The main objective of our study is to investigate models that are able to identify the key parts of an MRI image that result in AD prediction while also explaining how the model arrives at that particular prediction. We specifically include an interpretability layer called a prototypical part network (ProtoPNet) [11] in the models used for AD classification. This layer enables observing a model's reasoning for the predicted outcomes.…”
Section: Introduction (mentioning, confidence: 99%)
“…Future work includes adapting the proposed model to other sound recognition problems, such as sound event detection and audio tagging. Along with this, we seek to incorporate audio-domain knowledge into the development of other intrinsically interpretable neural network models based on prototypes [23,24] and concepts [22,25]. We also want to use the visualization tool for manual editing to study different ways of training the network with a human-in-the-loop approach, and to create tools for explaining the inner functionality of the network to end users.…”
Section: Discussion (mentioning, confidence: 99%)
“…Yet, in practice, interpretability is often realized as a set of application-specific constraints on the model, dictated by domain knowledge, such as causality, monotonicity, or sparsity. In this paper, we propose an intrinsic type of interpretability, based on examples/prototypes, similarly to [21][22][23][24][25]. Our interpretable model should be able to produce explanations in terms of sound prototypes.…”
Section: Introduction (mentioning, confidence: 99%)