Neural Natural Language Inference Models Enhanced with External Knowledge

Chen, Qian; Zhu, Xiaodan; Ling, Zhen-Hua; Inkpen, Diana; Wei, Si

doi:10.18653/v1/p18-1224

Cited by 222 publications

(166 citation statements)

References 48 publications

Citation types: Supporting, Mentioning (166), Contrasting


“…Task-specific KB architectures: Other work has focused on integrating KBs into neural architectures for specific downstream tasks (Yang and Mitchell, 2017; Sun et al., 2018; Chen et al., 2018; Bauer et al., 2018; Mihaylov and Frank, 2018; Wang and Jiang, 2019; Yang et al., 2019). Our approach instead uses KBs to learn more generally transferable representations that can be used to improve a variety of downstream tasks.…”

Section: Related Work (mentioning)

confidence: 99%

“…Then, the model recontextualizes the entity-span representations with word-to-entity attention to allow long-range interactions between contextual word representations and all entity spans in the context. The entire KAR is inserted between two layers in the middle of a pretrained model such as BERT. In contrast to previous approaches that integrate external knowledge into task-specific models with task supervision (e.g., Yang and Mitchell, 2017; Chen et al., 2018), our approach learns the entity linkers with self-supervision on unlabeled data. This results in general-purpose knowledge-enhanced representations that can be applied to a wide range of downstream tasks.…”

Section: Introduction (mentioning)

confidence: 99%


Knowledge Enhanced Contextual Word Representations

Peters, Neumann, Logan et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen


Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real-world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large-scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge-enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task, and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert's runtime is comparable to BERT's and it scales to large KBs.


“…Future work includes the following directions: (1) We plan to explore approaches for effectively representing and incorporating external knowledge (Chen et al, 2018b) in the ESIM model and the BERT model, such as knowledge graph and user profile. It is important to advance the understanding of how to effectively represent the interactions between the context and the external knowledge for the response selection task.…”

Section: Results (mentioning)

confidence: 99%

Sequential neural networks for noetic end-to-end response selection

Chen, Wang 2020

Computer Speech & Language

Self Cite


The noetic end-to-end response selection challenge, one track in the 7th Dialog System Technology Challenges (DSTC7), aims to push the state of the art of utterance classification for real-world goal-oriented dialog systems, for which participants need to select the correct next utterances from a set of candidates for the multi-turn context. This paper presents our systems, which are ranked top 1 on both datasets under this challenge, one focused and small (Advising) and the other more diverse and large (Ubuntu). Previous state-of-the-art models use hierarchy-based (utterance-level and token-level) neural networks to explicitly model the interactions among different turns' utterances for context modeling. In this paper, we investigate a sequential matching model based only on chain sequence for multi-turn response selection. Our results demonstrate that the potential of sequential matching approaches has not yet been fully exploited for multi-turn response selection. In addition to ranking top 1 in the challenge, the proposed model outperforms all previous models, including state-of-the-art hierarchy-based models, on two large-scale public multi-turn response selection benchmark datasets.

Keywords: DSTC7, response selection, ESIM, BERT, end-to-end, sequential matching approaches

1. We develop an Enhanced Sequential Inference Model (ESIM) based system for the DSTC7 noetic end-to-end response selection track. On top of the ESIM model, we explore methods for exploiting multiple word embeddings, heuristic data augmentation, tuning the ratio between positive and negative samples, and emphasizing the importance of the most recent context utterances.
2. We propose a two-step approach for selecting the next utterance from a large number of candidates (i.e., for subtask 2 on the Ubuntu dataset, we need to select the next utterance from a candidate pool of 120,000 sentences), by first using a sentence-encoding based method to select the top N candidates from the large set of candidates and then reranking them using ESIM, achieving a high performance with an acceptable overall computational cost.
3. We conduct systematic ablation analysis of the above-mentioned methods for enhancing the ESIM model performance. In particular, we develop effective and efficient model ensemble by averaging the output from models
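The two-step selection described in this abstract (a cheap sentence-encoding pass to shortlist the top N candidates, then a more expensive reranker) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the bag-of-words encoder stands in for the sentence-encoding method, and `toy_scorer` is a hypothetical placeholder for ESIM, which is not implemented. None of the names come from the paper.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List, Tuple


def bow_encode(sentence: str) -> Counter:
    """Toy sentence encoder (assumption): lowercase bag-of-words counts."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_top_n(context: str, candidates: List[str], n: int) -> List[str]:
    """Step 1: shortlist the n candidates closest to the context."""
    ctx = bow_encode(context)
    ranked = sorted(candidates, key=lambda c: cosine(ctx, bow_encode(c)), reverse=True)
    return ranked[:n]


def rerank(
    context: str,
    shortlist: List[str],
    scorer: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Step 2: score only the shortlist with the expensive model, best first."""
    scored = [(c, scorer(context, c)) for c in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_scorer(context: str, candidate: str) -> float:
    """Hypothetical stand-in for ESIM: Jaccard overlap of token sets."""
    a, b = set(context.lower().split()), set(candidate.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


if __name__ == "__main__":
    context = "how do i install python on ubuntu"
    pool = [
        "use apt to install python on ubuntu",
        "the weather is nice today",
        "install it",
        "python is a snake",
    ]
    shortlist = select_top_n(context, pool, n=2)
    for candidate, score in rerank(context, shortlist, toy_scorer):
        print(f"{score:.3f}  {candidate}")
```

The design point is that the reranker is called only N times rather than once per candidate in the pool, which is what keeps the overall cost acceptable when the pool is large.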


“…These two attempts show a direction towards solving the medical NLI problem, where the pretrained embeddings are fine-tuned on a medical corpus and are used in the state-of-the-art NLI architecture. Chen et al. (2018) proposed the use of external knowledge to help enrich neural-network based NLI models by applying Knowledge-enriched co-attention, Local inference collection with External Knowledge, and Knowledge-enhanced inference composition components. Another line of solution tries to bring in the extra domain knowledge from sources like Unified Medical Language System (UMLS) (Bodenreider, 2004).…”

Section: Introduction (mentioning)

confidence: 99%

“…Another line of solution tries to bring in the extra domain knowledge from sources like Unified Medical Language System (UMLS) (Bodenreider, 2004). Romanov and Shivade (2018) used the knowledge-directed attention based methods in (Chen et al., 2018) for Medical NLI. Another such attempt is made by , where they incorporate domain knowledge in terms of the definitions of medical concepts from UMLS with the state-of-the-art NLI model ESIM (Chen et al., 2017) and vanilla word embeddings of GloVe (Pennington et al., 2014) and fastText (Bojanowski et al., 2017).…”

Section: Introduction (mentioning)

confidence: 99%

Incorporating Domain Knowledge into Medical NLI using Knowledge Graphs

Sharma, Santra, Jana et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen


Recently, biomedical versions of embeddings obtained from language models, such as BioELMo, have shown state-of-the-art results for the textual inference task in the medical domain. In this paper, we explore how to incorporate structured domain knowledge, available in the form of a knowledge graph (UMLS), for the Medical NLI task. Specifically, we experiment with fusing embeddings obtained from the knowledge graph with the state-of-the-art approaches for the NLI task, which mainly rely on contextual word embeddings. We also experiment with fusing the domain-specific sentiment information for the task. Experiments conducted on the MedNLI dataset clearly show that this strategy improves the baseline BioELMo architecture for the Medical NLI task.
