2022
DOI: 10.48550/arxiv.2205.10511
Preprint

Improving Long Tailed Document-Level Relation Extraction via Easy Relation Augmentation and Contrastive Learning

Abstract: Toward real-world information extraction scenarios, research on relation extraction is advancing to document-level relation extraction (DocRE). Existing approaches for DocRE aim to extract relations by encoding various information sources in the long context with novel model architectures. However, the inherent long-tailed distribution problem of DocRE has been overlooked by prior work. We argue that mitigating the long-tailed distribution problem is crucial for DocRE in real-world scenarios. Motivated by the long-ta…

Cited by 1 publication (3 citation statements) | References 17 publications (27 reference statements)

“…With the introduction of DocRED (Yao et al., 2019), many approaches were proposed to extract relations from a document (Wang et al., 2019; Ye et al., 2020; Zhang et al., 2021; Xu et al., 2021; Zhou et al., 2021; Xie et al., 2022). The long-tailed data problem of DocRE has been addressed in some studies (Du et al., 2022; Tan et al., 2022a), as has low-resource DocRE (Zhou et al., 2023); however, most require additional pretraining, which is compute- and cost-intensive, while PRiSM only requires adjusting logits in existing models. Low-resource RE has been extensively studied at the sentence level, and we specifically focus on leveraging label information (Yang et al., 2020; Dong et al., 2021; Zhang and Lu, 2022), which PRiSM applies to the document level.…”
Section: Related Work (mentioning)
confidence: 99%
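The statement above contrasts pretraining-heavy fixes for the long tail with "adjusting logits in existing models." A minimal sketch of what prior-based logit adjustment can look like follows; the function, the counts, and the log-prior formulation are illustrative assumptions, not PRiSM's published implementation.

```python
# Minimal sketch: prior-based logit adjustment for a long-tailed
# relation classifier. Illustrative only; not PRiSM's actual code.
import numpy as np

def adjusted_logits(logits: np.ndarray, class_counts: np.ndarray,
                    tau: float = 1.0) -> np.ndarray:
    """Subtract tau * log(prior) from each relation's logit so that
    frequent relations no longer dominate the argmax at inference."""
    priors = class_counts / class_counts.sum()    # empirical label frequencies
    return logits - tau * np.log(priors + 1e-12)  # epsilon guards empty classes

# Toy usage: a head-heavy label distribution (relation 0 is frequent).
counts = np.array([5500, 300, 120, 80])  # hypothetical triple counts
raw = np.array([2.1, 1.9, 1.8, 1.7])     # raw logits favor relation 0
print(adjusted_logits(raw, counts).argmax())  # a tail relation can now win
```

Because the adjustment only rescales output scores by the label prior, it can be applied to an already-trained model at inference time, which is why it avoids the extra pretraining cost the statement criticizes.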
“…We argue that the reason is twofold. First, the long-tailed distribution of DocRE data encourages models to be overly confident in predicting frequent relations and underconfident about infrequent ones (Du et al., 2022; Tan et al., 2022a). Out of the 96 relations in DocRED (Yao et al., 2019), a widely used DocRE dataset, the 7 most frequent relations account for 55% of the total relation triples.…”
Section: Introduction (mentioning)
confidence: 99%
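The 55% figure above is a head-coverage statistic over the label distribution. A small sketch of how such a number is computed from a list of relation labels is below; the label names and toy counts are hypothetical, not DocRED data.

```python
# Minimal sketch: fraction of relation triples carried by the k most
# frequent relations, i.e. the head-coverage statistic quoted above.
from collections import Counter

def head_coverage(relation_labels: list[str], k: int = 7) -> float:
    """Share of all triples whose relation is among the k most frequent."""
    counts = Counter(relation_labels)
    top_k = sum(n for _, n in counts.most_common(k))
    return top_k / len(relation_labels)

# Toy usage with hypothetical labels; on DocRED, k=7 yields about 0.55.
labels = ["country"] * 50 + ["located_in"] * 30 + ["date_of_birth"] * 20
print(f"top-2 coverage: {head_coverage(labels, k=2):.2f}")  # prints 0.80
```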