Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1139

ERNIE: Enhanced Language Representation with Informative Entities

Abstract: Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. …

Cited by 1,069 publications (728 citation statements)
References 43 publications
“…F1 (Kolitsas et al, 2018) language understanding benchmark GLUE (Wang et al, 2018), the question answering (QA) benchmarks SQUAD V2 (Rajpurkar et al, 2018) and SWAG (Zellers et al, 2018), and the machine translation benchmark EN-DE WMT14. We confirm the finding from Zhang et al (2019) that additional entity knowledge is not beneficial for the GLUE benchmark. To our surprise, we also find that additional entity knowledge is neither helpful for the two QA datasets nor for machine translation.…”
Section: Introduction (supporting)
confidence: 87%
“…The baselines we compare against are BERT BASE, BERT LARGE, the pre-BERT state of the art, and two contemporaneous papers that add similar types of knowledge to BERT. ERNIE (Zhang et al, 2019) uses TAGME (Ferragina and Scaiella, 2010) to link entities to Wikidata, retrieves the associated entity embeddings, and fuses them into BERT BASE by fine-tuning. Soares et al (2019) Relation extraction Our first task is relation extraction using the TACRED (Zhang et al, 2017) and SemEval 2010 Task 8 (Hendrickx et al, 2009) datasets.…”
Section: Downstream Tasks (mentioning)
confidence: 99%
“…Though we can train a usable and stable RE system based on the above-mentioned scenarios, which can well predict those relations appearing frequently in data, some long-tail relations with few instances in data are still neglected. Recently, some methods have been proposed to provide a different view of this problem by formalizing RE as a few-shot learning problem (Han et al, 2018c; Gao et al, 2019; Ye and Ling, 2019; Soares et al, 2019; Zhang et al, 2019). As shown in Figure 1, each relation only have a handful of instances in the supporting set in a few-shot RE scenario, and models are required to be capable of accurately capturing relation patterns of these small amounts of training instances.…”
Section: Few-shot Relation Extraction (mentioning)
confidence: 99%