Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.382

Make Up Your Mind! Adversarial Generation of Inconsistent Natural Language Explanations

Abstract: To increase trust in artificial intelligence systems, a promising research direction consists of designing neural models capable of generating natural language explanations for their predictions. In this work, we show that such models are nonetheless prone to generating mutually inconsistent explanations, such as "Because there is a dog in the image." and "Because there is no dog in the [same] image.", exposing flaws in either the decision-making process of the model or in the generation of the explanations. W…
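The core idea of the paper is to find inputs for which a model produces logically contradictory explanations for related queries. Below is a minimal sketch of such a consistency check; the explain() function and the surface-level negation heuristic are illustrative assumptions, not the paper's actual attack, which adversarially searches the input space for inconsistency-revealing variants.

# Minimal sketch of the inconsistency check described in the abstract,
# assuming a hypothetical explain(premise, hypothesis) wrapper around a
# trained explanation-generating model (e.g., an e-SNLI-style generator).

def negate(explanation: str) -> str:
    # Toy surface-level negation used only for illustration.
    if " is no " in explanation:
        return explanation.replace(" is no ", " is a ")
    return explanation.replace(" is a ", " is no ")

def are_inconsistent(expl_a: str, expl_b: str) -> bool:
    # Flag a pair of explanations whose surface forms are direct negations.
    return negate(expl_a) == expl_b

def explain(premise: str, hypothesis: str) -> str:
    # Placeholder for the real model call; it fakes contradictory outputs
    # so the check above has something to flag.
    if "no dog" in hypothesis:
        return "Because there is no dog in the image."
    return "Because there is a dog in the image."

e1 = explain("An image.", "There is a dog.")
e2 = explain("An image.", "There is no dog.")
print(are_inconsistent(e1, e2))  # True -> the explanations contradict each other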

Cited by 120 publications (198 citation statements)
References 19 publications
“…Hence, they can offer diagnostic strengths that help in understanding the model's working mechanism and in model debugging, i.e., identifying flaws in the model and verifying that it works as intended. Multimodal explainability models can also identify adversarial attacks and defense mechanisms [169] and fairness and bias issues [13], providing scope for troubleshooting, rectification, and improvement of overall model performance, predictive power, and explanatory power.…”
Section: A. Significance of Explainability in Multimodal Data
Citation type: mentioning
confidence: 99%
“…In this section, we briefly review related work on sentiment classification [11,13], knowledge-aware sentiment analysis [9,14], and natural language explanation classification [5,16,23]. Sentiment Analysis. Sentiment analysis and emotion recognition have long attracted attention in multiple fields, such as natural language processing, psychology, and cognitive science.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…However, the space of possible free-form explanations is incredibly large, inherently ambiguous, and difficult to annotate or evaluate (Wiegreffe et al., 2020; Latcinnik and Berant, 2020). Furthermore, quantifying the model's dependence on free-form explanations is also challenging (Camburu et al., 2020). We address these challenges by proposing an unsupervised method that uses contrastive prompts, which require the model to explicitly contrast different possible answers in its explanation (Table 1).…”
Section: Introduction
Citation type: mentioning
confidence: 99%
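For context on the last excerpt, a contrastive prompt asks the model to justify the chosen answer while explicitly ruling out the alternatives, which makes contradictions between answer and explanation easier to spot. The template below is a hypothetical illustration, not the cited paper's exact wording.

# Hypothetical contrastive prompt: the model must contrast the candidate
# answers rather than justify a single answer in isolation.
question = "Is there a dog in the image?"
answers = ["yes", "no"]
prompt = (
    f"Question: {question}\n"
    f"Candidate answers: {', '.join(answers)}\n"
    "Explain why the correct answer is right AND why the other answer is wrong:"
)
print(prompt)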