Rare Tokens Degenerate All Tokens: Improving Neural Text Generation via Adaptive Gradient Gating for Rare Token Embeddings

Yu, Sangwon; Song, Jongyoon; Kim, Heeseung; Lee, Seong-min; Ryu, Woo-Jong; Yoon, Sungroh

doi:10.48550/arxiv.2109.03127

Cited by 1 publication

(1 citation statement)

References 0 publications

“…This fundamental NLP task can benefit various applications, such as Information Extraction [3,4], Question Answering [5,6], Machine Translation [7,8], and Summarization [9,10], which are of great research value. Coref requires document-level encoding.…”

Section: Introduction; citation type: mentioning

confidence: 99%

Coreference Resolution Based on High-Dimensional Multi-Scale Information

Wang, Ding, Wang et al. 2024

Entropy

Coreference resolution is a key task in Natural Language Processing. It is difficult to evaluate the similarity of long-span texts, which makes text-level encoding somewhat challenging. This paper first compares how commonly used methods for improving the model's global information collection ability affect BERT encoding performance. Based on this comparison, a multi-scale context information module is designed to improve the applicability of the BERT encoding model across different text spans. In addition, linear separability is improved through dimension expansion. Finally, cross-entropy loss is used as the loss function. After the module designed in this article is combined with BERT and SpanBERT, F1 increased by 0.5% and 0.2%, respectively.
