2021
DOI: 10.48550/arxiv.2104.02768
Preprint
Robust Semantic Interpretability: Revisiting Concept Activation Vectors

Abstract: Interpretability methods for image classification assess model trustworthiness by attempting to expose whether the model is systematically biased or attending to the same cues as a human would. Saliency methods for feature attribution dominate the interpretability literature, but these methods do not address semantic concepts such as the textures, colors, or genders of objects within an image. Our proposed Robust Concept Activation Vectors (RCAV) quantifies the effects of semantic concepts on individual model …

Cited by 3 publications (3 citation statements) | References 20 publications

Citation statements, ordered by relevance:
“…The authors of (Kim et al. 2018) define a Concept Activation Vector as the normal to a hyperplane separating samples without a concept from samples with a concept in the model's latent activations for a selected layer. This hyperplane is commonly computed by solving a classification problem, e.g., using SVMs, ridge, lasso, or logistic regression (Kim et al. 2018; Pfau et al. 2021; Anders et al. 2022; Yuksekgonul, Wang, and Zou 2023). We refer to Appendix A.4 for details on optimizer differences.…”
Section: Choosing the Right Direction
Confidence: 99%
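As context for the statement above: a minimal sketch of how such a hyperplane-based CAV could be fit with logistic regression, assuming layer activations have already been extracted; function and variable names are illustrative, not taken from any of the cited papers' code. Logistic regression is just one of the linear separators the statement lists; an SVM would generally yield a different normal vector.

```python
# Minimal sketch: fitting a Concept Activation Vector (CAV) as the normal
# to a hyperplane separating concept vs. non-concept layer activations.
# Assumes activations were extracted from a chosen layer beforehand; all
# names here are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_cav(concept_acts: np.ndarray, random_acts: np.ndarray) -> np.ndarray:
    """Return a unit-norm CAV from two sets of flattened layer activations.

    concept_acts: (n_concept, d) activations for inputs showing the concept.
    random_acts:  (n_random, d) activations for inputs without it.
    """
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)  # linear separating hyperplane
    cav = clf.coef_.ravel()            # normal vector to the hyperplane
    return cav / np.linalg.norm(cav)   # unit-normalize for directional derivatives
```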
“…In general, there is no guarantee that the symbolic and sub-symbolic representations Ĥ and M capture exactly the same information. This introduces a faithfulness issue, meaning that CBE explanations may not portray a reliable picture of the model's inference process [11, 48–50].…”
Section: Machine Representations: The Post Hoc Case
Confidence: 99%
“…Kim et al. [13] introduce Concept Activation Vectors (CAVs), which use directional derivatives to represent human-understandable concepts from a model's activations and quantify the influence of a concept on the predictions of a single target class. Pfau et al. [14] build on TCAV by providing global and local conceptual sensitivities and accounting for the non-linear influence of concepts on a model's predictions. Lucieri et al. [15] explore TCAV in the context of skin lesion classification using an InceptionV4 model built by the REasoning for COmplex Data (RECOD) Lab.…”
Section: A. Interpretable AI
Confidence: 99%
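To make the distinction in the statement above concrete: a hedged sketch of the two sensitivity notions it mentions, assuming the network can be split at the chosen layer into a `bottom` half (input to flat activations) and a `top` half (flat activations to logits); `eps` and all names are illustrative, and this is a sketch of the general idea rather than either paper's exact implementation.

```python
# Hedged sketch of two concept-sensitivity scores. Assumes the network splits
# at the chosen layer into `bottom` (input -> flat activations, shape (batch, d))
# and `top` (flat activations -> class logits). Names and eps are illustrative.
import torch

def tcav_sensitivity(bottom, top, x, cav, target_class):
    """TCAV-style score: first-order directional derivative of the
    target-class logit along the CAV at the layer's activations."""
    h = bottom(x).detach().requires_grad_(True)  # (batch, d) activations
    logit = top(h)[:, target_class].sum()
    (grad,) = torch.autograd.grad(logit, h)      # d logit / d activations
    return grad @ cav                            # one score per input

def perturbation_sensitivity(bottom, top, x, cav, target_class, eps=1.0):
    """Non-linear score: nudge activations along the CAV, re-run the top of
    the network, and compare target-class probabilities directly."""
    with torch.no_grad():
        h = bottom(x)                            # (batch, d) activations
        base = top(h).softmax(dim=-1)[:, target_class]
        pert = top(h + eps * cav).softmax(dim=-1)[:, target_class]
    return pert - base
```

The first function linearizes the top of the network around the current activations, while the second re-evaluates it after the perturbation, which is one way to account for the non-linear influence of concepts that the statement attributes to Pfau et al.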