Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.166

Multilingual Entity and Relation Extraction Dataset and Model

Abstract: We present a novel dataset and model for a multilingual setting to approach the task of Joint Entity and Relation Extraction. The SMiLER dataset consists of 1.1M annotated sentences, representing 36 relations and 14 languages. To the best of our knowledge, this is currently both the largest and the most comprehensive dataset of this type. We introduce HERBERTa, a pipeline that combines two independent BERT models: one for sequence classification, and the other for entity tagging. The model achieves micro F …

Cited by 5 publications (4 citation statements)
References 13 publications
“…Contemporary to our work, other multilingual RE datasets and methods are being developed. These include a dataset for joint entity and relation extraction (Seganti et al, 2021), a model for multilingual KB completion (Singh et al, 2021), and an approach for automatic construction of crosslingual training data for Open IE (Kolluru et al, 2022). Our proposed dataset, DiS-ReX, has already been used for further research on the Multilingual DS-RE task (Rathore et al, 2022).…”
Section: Related Work (mentioning)
confidence: 99%
“…The multi-lingual dimension is gaining more interest for RE. Following this trend, Seganti et al (2021) presented SMiLER, a multilingual dataset (14 languages) from Wikipedia with relations belonging to nine domains.…”
Section: Relation Extraction Datasets Survey (mentioning)
confidence: 99%
“…The last step (RC) is usually a multi-class classification to assign a relation type r to the positive samples from the previous step. Some studies merge RI and RC (Seganti et al, 2021) into one step, by adding a no-relation (no-rel) label. Other studies instead reduce the task to RC, and assume there exists a relation between two entities and the task is to determine the type (without a no-rel label).…”
Section: The Relation Extraction Task (mentioning)
confidence: 99%