Findings of the Association for Computational Linguistics: ACL 2022
DOI: 10.18653/v1/2022.findings-acl.150

Dict-BERT: Enhancing Language Model Pre-training with Dictionary

Abstract: Pre-trained language models (PLMs) aim to learn universal language representations by conducting self-supervised training tasks on large-scale corpora. Since PLMs capture word semantics in different contexts, the quality of word representations depends heavily on word frequency, which usually follows a heavy-tailed distribution in the pre-training corpus. Thus, the embeddings of rare words on the tail are usually poorly optimized. In this work, we focus on enhancing language model pre-training by leveraging definitions of rare words in dictionaries.
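To make the idea concrete, here is a minimal sketch of augmenting a pre-training example with dictionary definitions of its rare words before standard masked language modeling. The frequency table, rarity threshold, DICTIONARY lookup, and [SEP]-style concatenation are illustrative assumptions, not the paper's exact recipe.

```python
from collections import Counter

# Hypothetical corpus word counts and dictionary entries (illustrative only).
WORD_FREQ = Counter({"the": 120000, "model": 45000, "remark": 900, "anserine": 3})
RARE_THRESHOLD = 10  # words seen fewer than this many times count as "rare"
DICTIONARY = {"anserine": "of or relating to geese; silly or foolish"}

def augment_with_definitions(sentence: str) -> str:
    """Append dictionary definitions of the sentence's rare words."""
    tokens = [t.strip(".,;:!?").lower() for t in sentence.split()]
    rare = [t for t in tokens if WORD_FREQ.get(t, 0) < RARE_THRESHOLD and t in DICTIONARY]
    if not rare:
        return sentence
    definitions = " [SEP] ".join(f"{w}: {DICTIONARY[w]}" for w in rare)
    # The augmented string would then go through ordinary MLM pre-training,
    # so the rare word co-occurs with its definition.
    return f"{sentence} [SEP] {definitions}"

print(augment_with_definitions("The model misread the anserine remark."))
# -> "The model misread the anserine remark. [SEP] anserine: of or relating to geese; silly or foolish"
```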

Cited by 33 publications (36 citation statements)
References 22 publications

“…Another research line aims to further boost the performance of large-scale LLMs by leveraging their self-generated content. Some works use the self-generated text as context to help the model answer questions, such as eliciting intermediate rationales as CoT (Kojima et al., 2023) or generating background articles for reading comprehension (Yu et al., 2023). Others instruct LLMs to generate demonstrations for ICL during inference, such as prompting LLMs to generate reliable QA pairs as self-prompted in-context demonstrations.…”
Section: Model Enhancement via LLM Generation (mentioning)
confidence: 99%
“…Incorporating external knowledge is essential for many NLG tasks to augment the limited textual information (Yu et al., 2022c; Dong et al., 2021; Yu et al., 2022b). Some recent work explored using graph neural networks (GNNs) to reason over multi-hop relational knowledge graph (KG) paths (Zhou et al., 2018; Jiang et al., 2019; Zhang et al., 2020a; Wu et al., 2020; Yu et al., 2022a; Zeng et al., 2021).…”
Section: Knowledge Graph for Text Generation (mentioning)
confidence: 99%
“…Finally, we devise a denoising auto-encoder-style learning objective and train the network to reconstruct selectively masked sentence parts. Our use of symbolic knowledge (Yu et al., 2021) of IEs to aid the learning of their embeddings results in the model needing a significantly smaller amount of data (∼60 MB) than that required for LM pre-training (∼160 GB of text for BART).…”
Section: All At Sea (mentioning)
confidence: 99%