Progress on neural models has shown that named entity recognition is no longer a problem given enough labeled data. However, collecting and annotating that data is labor-intensive, time-consuming, and expensive. In this paper, we decompose a sentence into two parts, entity and context, and rethink the relationship between them and model performance from a causal perspective. Based on this, we propose the Counterfactual Generator, which generates counterfactual examples through interventions on existing observational examples to augment the original dataset. Experiments on three datasets show that our method improves the generalization ability of models when observational examples are limited. In addition, we provide a theoretical foundation, using a structural causal model to analyze the spurious correlations between input features and output labels. We investigate the causal effects of entity and context on model performance under both the non-augmented and the augmented conditions. Interestingly, we find that non-spurious correlations reside more in the entity representation than in the context representation. As a result, our method eliminates part of the spurious correlation between the context representation and the output labels. The code is available at https://github.com/xijiz/cfgen.
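The abstract leaves the intervention mechanics implicit; below is a minimal sketch of one plausible reading, in which an annotated entity is swapped for another surface form of the same type while the surrounding context is held fixed. The function name, the entity lexicon, and the example data are illustrative assumptions, not code from the linked repository.

```python
# Hypothetical sketch of an entity-level intervention for NER data
# augmentation; an assumed reading of the method, not the authors'
# implementation (see https://github.com/xijiz/cfgen).
import random

def generate_counterfactual(tokens, span, entity_lexicon, rng=random.Random(0)):
    """Swap the annotated entity for another of the same type, keeping
    the surrounding context fixed.

    tokens: list of words, e.g. ["I", "visited", "Paris", "yesterday"]
    span:   (start, end, type) with end exclusive; one entity for simplicity
    entity_lexicon: dict mapping an entity type to candidate token lists
    """
    start, end, ent_type = span
    substitute = rng.choice(entity_lexicon[ent_type])
    new_tokens = tokens[:start] + substitute + tokens[end:]
    # The entity label moves with the substitute; context labels are unchanged.
    new_span = (start, start + len(substitute), ent_type)
    return new_tokens, new_span

tokens = ["I", "visited", "Paris", "yesterday"]
lexicon = {"LOC": [["Berlin"], ["New", "York"]]}
print(generate_counterfactual(tokens, (2, 3, "LOC"), lexicon))
# e.g. (['I', 'visited', 'New', 'York', 'yesterday'], (2, 4, 'LOC'))
```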
In order to better understand the reasons behind model behaviors (i.e., predictions), much recent work has exploited generative models to provide complementary explanations. However, existing approaches in natural language processing (NLP) mainly answer "WHY A" rather than the contrastive "WHY A NOT B", which has been shown in other research fields to better distinguish confusing candidates and improve model performance. In this paper, we focus on generating contrastive explanations with counterfactual examples in NLI and propose a novel Knowledge-Aware generation framework (KACE). Specifically, we first identify rationales (i.e., key phrases) in the input sentences and use them as key perturbations for generating counterfactual examples. After obtaining qualified counterfactual examples, we feed them, together with the original examples and external knowledge, into a knowledge-aware generative pre-trained language model that produces contrastive explanations. Experimental results show that contrastive explanations help clarify the difference between the predicted answer and the other answer options. Moreover, we train a BERT-large-based NLI model enhanced with contrastive explanations and achieve an accuracy of 91.9% on SNLI, an improvement of 5.7% over ETPA ("Explain-Then-Predict-Attention") and 0.6% over NILE ("WHY A").
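To make the three-stage pipeline concrete (rationale identification, knowledge-guided perturbation, contrastive generation), here is a toy sketch. All names, the antonym table, and the prompt format are assumptions for illustration only; the paper uses a trained rationale extractor, external knowledge sources, and a knowledge-aware generative pre-trained language model rather than these stand-ins.

```python
# Toy walk-through of the KACE-style stages; not the authors' API.
ANTONYMS = {"happy": "sad", "outdoors": "indoors"}  # stand-in for external knowledge

def identify_rationales(hypothesis):
    # Stand-in extractor: treat any word with a knowledge entry as a rationale.
    return [w for w in hypothesis.split() if w in ANTONYMS]

def make_counterfactual(hypothesis, rationale):
    # Perturb the rationale using related knowledge to flip the likely label.
    return hypothesis.replace(rationale, ANTONYMS[rationale])

def build_explanation_prompt(premise, hypothesis, counterfactual, pred, alt):
    # The original/counterfactual pair lets a generative LM contrast
    # "WHY pred" with "WHY NOT alt".
    return (f"premise: {premise}\nhypothesis: {hypothesis}\n"
            f"counterfactual: {counterfactual}\n"
            f"explain why the answer is {pred} and not {alt}:")

premise = "A child is smiling at the park."
hypothesis = "The child is happy outdoors."
rationale = identify_rationales(hypothesis)[0]
cf = make_counterfactual(hypothesis, rationale)
print(build_explanation_prompt(premise, hypothesis, cf,
                               "entailment", "contradiction"))
```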