K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters

Wang, Ruize; Tang, Duyu; Duan, Nan; Wei, Zhongyu; Huang, Xuanjing; Ji, Jianshu; Cao, Guihong; Jiang, Daxin; Zhou, Ming

doi:10.48550/arxiv.2002.01808

Cited by 105 publications

(60 citation statements)

References 33 publications

“…These encyclopedia KGs are able to provide abundant knowledge for PLMs to integrate. A majority of existing work on KE-PLMs [108][26][105][96][90][82][66] uses Wikidata as its knowledge source. Typically, entities in Wikidata are linked with entity mentions in the text of Wikipedia.…”

Section: Encyclopedia Knowledge (mentioning)

confidence: 99%

“…Though most existing approaches exploit only one knowledge source, it is worth noting that certain methods attempt to incorporate knowledge from more than one source. For example, K-Adapter [96] incorporates knowledge from multiple sources by learning a different adapter for each knowledge source. It exploits both dependency relations as linguistic knowledge and relation/fact knowledge from Wikidata.…”

Section: Sentiment Knowledge (mentioning)

confidence: 99%
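
The excerpt above says K-Adapter learns a different adapter for each knowledge source. A minimal sketch of that idea follows, in plain Python. The bottleneck adapter shape (down-projection, ReLU, up-projection, residual), the concatenation step in `fuse`, and all names (`make_adapter`, `apply_adapter`, `fuse`, the `"factual"` and `"linguistic"` keys) are illustrative assumptions; the excerpt does not specify K-Adapter's actual architecture.

```python
import random


def make_adapter(hidden, bottleneck, rng):
    """Build one toy adapter as a (down, up) pair of weight matrices.

    Assumption: a generic bottleneck adapter, not K-Adapter's exact design.
    """
    down = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(bottleneck)]
    up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)] for _ in range(hidden)]
    return down, up


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def apply_adapter(adapter, h):
    """Down-project, apply ReLU, up-project, then add the residual input."""
    down, up = adapter
    z = [max(0.0, x) for x in matvec(down, h)]
    return [a + b for a, b in zip(h, matvec(up, z))]


def fuse(h, adapters):
    """Concatenate the backbone feature with each adapter's output.

    The backbone feature `h` is never modified, which mimics a frozen
    pre-trained model; only the adapters would carry trainable weights.
    """
    fused = list(h)
    for name in sorted(adapters):  # fixed order keeps the layout stable
        fused.extend(apply_adapter(adapters[name], h))
    return fused


rng = random.Random(0)
# One adapter per knowledge source (hypothetical names).
adapters = {
    "factual": make_adapter(4, 2, rng),
    "linguistic": make_adapter(4, 2, rng),
}
h = [0.5, -1.0, 0.25, 2.0]  # stand-in for one hidden state of the backbone
fused = fuse(h, adapters)
print(len(fused))  # backbone (4) + two adapter outputs (4 each)
```

Keeping each knowledge source in its own adapter means a new source can be added by training one more adapter, without touching the backbone or the adapters already trained.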

“…ERICA [66] applies contrastive loss on entity and relation, which pulls neighbor entity/relation close and pushes non-neighbors far apart in the embedding space. K-Adapter [96] introduces a factual adapter which incorporates relation information by performing relation classification based on the entity context. K-BERT [49] injects knowledge by augmenting sentence with the triplets from KG to transform it into a knowledge-rich sentence tree.…”

Section: Relation Knowledge (mentioning)

confidence: 99%
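
The excerpt above describes a contrastive loss that pulls neighboring entities or relations together and pushes non-neighbors apart in embedding space. The sketch below shows a generic InfoNCE-style loss with that behavior. It is an assumption chosen for illustration, not ERICA's exact objective; the function names and the `temperature` value are hypothetical.

```python
import math


def dot(u, v):
    """Dot product used as the similarity score between two embeddings."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss for one anchor.

    The loss is low when the anchor is more similar to its positive
    (a neighbor) than to any negative (a non-neighbor).
    """
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


anchor = [1.0, 0.0]
neighbor = [1.0, 0.0]
non_neighbor = [-1.0, 0.0]

good = info_nce(anchor, neighbor, [non_neighbor])  # neighbor is the positive
bad = info_nce(anchor, non_neighbor, [neighbor])   # roles swapped
print(good < bad)
```

Minimizing this quantity over many anchors raises the similarity of neighbor pairs relative to non-neighbor pairs, which is the pull/push effect the excerpt describes.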

“…Since many existing approaches incorporate entity-level knowledge, entity related tasks (e.g., entity typing and relation classification) become natural testbeds for evaluating the efficacy of these KE-PLMs. By injecting entity information …”

[Table rows fused into the excerpt; of the header row, only "Knowledge Source" survives]
ERICA [66] Yes entity/relation discrimination Wikipedia, Wikidata
ERNIE (THU) [108] Yes entity prediction Wikipedia/Wikidata
ERNIE 2.0 (Baidu) [83] Yes masked entity/phrase N/A
E-BERT [64] Yes entity/wordpiece alignment Wikipedia2Vec
E(commerce)-BERT [106] Yes neighbor Product Reconstruction product graph/AutoPhrase [75]
EaE [21] Yes mention detection/linking Wikipedia
CokeBERT [80] Yes entity prediction Wikipedia/Wikidata
COMET [6] No autoregressive ATOMIC, ConceptNet
K-Adapter [96] No dependency relation Wikipedia, Wikidata, Stanford Parser
KnowBERT [61] Yes entity linking WordNet, Wikipedia
K-BERT [49] No finetuning WikiZh, WebtextZh, CN-DBpedia HowNet, MedicalKG
KEPLER [97] Yes TransE scoring Wikipedia/Wikidata
KG-BERT [102] Yes relation cross-entropy ConceptNet
KG-BART [51] Yes masked concept ConceptNet
KgPLM [26] Yes generative/discriminative masked entity Wikipedia/Wikidata
FaE [90] Yes masked entity et.al Wikipedia/Wikidata
JAKET [105] Yes entity category/relation type/masked entity Wikipedia/Wikidata
LUKE [100] Yes entity prediction Wikipedia
WKLM [99] Yes entity replacement detection Wikipedia/Wikidata
CoLAKE [82] Yes masked entity prediction Wikipedia/Wikidata
KT-NET [101] No finetuning N/A
LIBERT [40] Yes lexical relation prediction WordNet
SenseBERT [41] Yes supersense prediction WordNet
Syntax-BERT [2] No masks induced by syntax tree parsing syntax tree
SentiLARE [36] Yes POS/ word level polarity/sentiment polarity SentiWordNet
[44] No finetuning ConceptNet
COCOLM [104] Yes discourse relation/co-occurrence relation ASER
[22] Yes autoregressive ConceptNet/ATOMIC
AMS [103] Yes distractor-based loss ConceptNet
GLM …

Section: Entity-related Tasks (mentioning)

confidence: 99%

“…These observations motivate work on designing more knowledge-aware pre-trained models. Recently, an ever-growing body of work aims at explicitly incorporating knowledge into PLMs [100][108][61][90][96][49][33]. They exploit knowledge from various sources such as encyclopedia knowledge, commonsense knowledge and linguistic knowledge with different injection strategies.…”

Section: Introduction (mentioning)

confidence: 99%

Knowledge Enhanced Pretrained Language Models: A Compreshensive Survey

Wei, Wang, Zhang et al. 2021

Preprint

Pretrained Language Models (PLMs) have established a new paradigm by learning informative contextualized representations from large-scale text corpora. This paradigm has revolutionized the field of natural language processing and set new state-of-the-art performance on a wide variety of NLP tasks. However, although PLMs can store certain knowledge and facts from their training corpora, their knowledge awareness is still far from satisfactory. To address this issue, integrating knowledge into PLMs has recently become a very active research area, and a variety of approaches have been developed. In this paper, we provide a comprehensive survey of the literature on this emerging and fast-growing field: Knowledge Enhanced Pretrained Language Models (KE-PLMs). We introduce three taxonomies to categorize existing work. We also survey the various NLU and NLG applications on which KE-PLMs have demonstrated superior performance over vanilla PLMs. Finally, we discuss challenges facing KE-PLMs and promising directions for future research.

CokeBERT: Contextual knowledge selection and embedding towards enhanced pre-trained language models

Han, Zhang et al. 2021

AI Open

Evaluating Knowledge Fusion Models on Detecting Adverse Drug Events in Text

Wegner, Fröhlich, Madan 2024

Preprint

Background: Detecting adverse drug events (ADEs) of drugs that are already on the market is an essential part of the pharmacovigilance work conducted by both medical regulatory bodies and the pharmaceutical industry. Concerns regarding drug safety and economic interests motivate the efforts to identify ADEs. Social media platforms play an important role here as a valuable source of ADE reports, particularly through posts discussing adverse events associated with specific drugs.

Methodology: Our study assesses the effectiveness of knowledge fusion approaches combined with transformer-based NLP models for extracting ADE mentions from diverse datasets, for instance texts from Twitter, websites like askapatient.com, and drug labels. The extraction task is formulated as a named entity recognition (NER) problem. The proposed methodology applies fusion learning methods to enhance the performance of transformer-based language models with additional contextual knowledge from ontologies or knowledge graphs. Additionally, the study introduces a multi-modal architecture that combines transformer-based language models with graph attention networks (GAT) to identify ADE spans in textual data.

Results: A multi-modal model consisting of the ERNIE model with knowledge on drugs reached an F1-score of 71.84% on the CADEC corpus. Additionally, a combination of a graph attention network with BERT resulted in an F1-score of 65.16% on the SMM4H corpus. Impressively, the same model achieved an F1-score of 72.50% on the PSYTAR corpus, 79.54% on the ADE corpus, and 94.15% on the TAC corpus. Except on the CADEC corpus, the knowledge fusion models consistently outperformed the baseline model, BERT.

Conclusion: Our study demonstrates the significance of context knowledge in improving the performance of knowledge fusion models for detecting ADEs from various types of textual data.
