ERNIE: Enhanced Language Representation with Informative Entities

Zhang, Zhengyan; Han, Xu; Liu, Zhiyuan; Jiang, Xin; Sun, Maosong; Li, Qun

doi:10.18653/v1/p19-1139

Cited by 1,069 publications

(728 citation statements)

References 43 publications

Citation statement types: Supporting, Mentioning (643), Contrasting, Unclassified


“…F1 (Kolitsas et al, 2018) language understanding benchmark GLUE (Wang et al, 2018), the question answering (QA) benchmarks SQUAD V2 (Rajpurkar et al, 2018) and SWAG (Zellers et al, 2018), and the machine translation benchmark EN-DE WMT14. We confirm the finding from Zhang et al (2019) that additional entity knowledge is not beneficial for the GLUE benchmark. To our surprise, we also find that additional entity knowledge is neither helpful for the two QA datasets nor for machine translation.…”

Section: Introduction (supporting)

confidence: 87%

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking

Broscheit

2019

Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)



“…The baselines we compare against are BERT BASE, BERT LARGE, the pre-BERT state of the art, and two contemporaneous papers that add similar types of knowledge to BERT. ERNIE (Zhang et al, 2019) uses TAGME (Ferragina and Scaiella, 2010) to link entities to Wikidata, retrieves the associated entity embeddings, and fuses them into BERT BASE by fine-tuning. Soares et al (2019) Relation extraction Our first task is relation extraction using the TACRED (Zhang et al, 2017) and SemEval 2010 Task 8 (Hendrickx et al, 2009) datasets.…”

Section: Downstream Tasks (mentioning)

confidence: 99%

Knowledge Enhanced Contextual Word Representations

Peters, Neumann, Logan et al.

2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen



Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we first use an integrated entity linker to retrieve relevant entity embeddings, then update contextual word representations via a form of word-to-entity attention. In contrast to previous approaches, the entity linkers and self-supervised language modeling objective are jointly trained end-to-end in a multitask setting that combines a small amount of entity linking supervision with a large amount of raw text. After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task and downstream performance on relationship extraction, entity typing, and word sense disambiguation. KnowBert's runtime is comparable to BERT's and it scales to large KBs.
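The abstract above mentions updating contextual word representations through "a form of word-to-entity attention". As a rough illustration only, the sketch below uses plain scaled dot-product attention with a residual connection. This is an assumption made for the example and not KnowBert's actual architecture, which the abstract does not specify. The function name `word_to_entity_attention`, the array shapes, and the random inputs are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def word_to_entity_attention(words, entities):
    """Update word vectors with an attention-weighted mix of entity vectors.

    words:    (num_words, dim) contextual word representations
    entities: (num_entities, dim) retrieved entity embeddings
    Assumption: generic scaled dot-product attention plus a residual
    connection, chosen for illustration only.
    """
    dim = words.shape[-1]
    # Similarity of every word to every candidate entity.
    scores = words @ entities.T / np.sqrt(dim)
    # Each word distributes its attention over the entities.
    weights = softmax(scores, axis=-1)
    # Residual update: original word vector plus attended entity information.
    updated = words + weights @ entities
    return updated, weights


# Hypothetical toy inputs: 5 words, 3 candidate entities, dimension 8.
rng = np.random.default_rng(0)
words = rng.normal(size=(5, 8))
entities = rng.normal(size=(3, 8))

updated, weights = word_to_entity_attention(words, entities)
print(updated.shape, weights.shape)
```

Each row of `weights` sums to 1, so every word receives a convex combination of the entity embeddings added to its original vector. A real system would add learned projections and restrict attention to linked mention spans.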


“…Though we can train a usable and stable RE system based on the above-mentioned scenarios, which can well predict those relations appearing frequently in data, some long-tail relations with few instances in data are still neglected. Recently, some methods have been proposed to provide a different view of this problem by formalizing RE as a few-shot learning problem (Han et al, 2018c; Gao et al, 2019; Ye and Ling, 2019; Soares et al, 2019; Zhang et al, 2019). As shown in Figure 1, each relation only have a handful of instances in the supporting set in a few-shot RE scenario, and models are required to be capable of accurately capturing relation patterns of these small amounts of training instances.…”

Section: Few-shot Relation Extraction (mentioning)

confidence: 99%

OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction

Han, Gao, Yao et al.

2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen

Self Cite



OpenNRE is an open-source and extensible toolkit that provides a unified framework to implement neural models for relation extraction (RE). Specifically, by implementing typical RE methods, OpenNRE not only allows developers to train custom models to extract structured relational facts from the plain text but also supports quick model validation for researchers. Besides, OpenNRE provides various functional RE modules based on both TensorFlow and PyTorch to maintain sufficient modularity and extensibility, making it easy to incorporate new models into the framework. Besides the toolkit, we also release an online system to meet real-time extraction without any training and deploying. Meanwhile, the online system can extract facts in various scenarios and align the extracted facts to Wikidata, which may benefit various downstream knowledge-driven applications (e.g., information retrieval and question answering). More details of the toolkit and online system can be obtained from

