Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 2021
DOI: 10.18653/v1/2021.eacl-main.65

Self-Training Pre-Trained Language Models for Zero- and Few-Shot Multi-Dialectal Arabic Sequence Labeling

Abstract: A sufficient amount of annotated data is usually required to fine-tune pre-trained language models for downstream tasks. Unfortunately, attaining labeled data can be costly, especially for multiple language varieties and dialects. We propose to self-train pre-trained language models in zero- and few-shot scenarios to improve performance on data-scarce varieties using only resources from data-rich ones. We demonstrate the utility of our approach in the context of Arabic sequence labeling by using a language mode…
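The abstract describes the approach only at a high level. Purely as an illustration, a minimal, framework-agnostic sketch of the kind of self-training loop it refers to could look as follows; the helper names (`train_model`, `predict_tags`), the confidence threshold, and the number of rounds are assumptions made for this sketch, not details taken from the paper.

```python
# Illustrative self-training loop for zero-/few-shot sequence labeling.
# `train_model(examples)` and `predict_tags(model, sentence)` are injected
# helpers standing in for any token-classification fine-tuning / inference
# routine (e.g. a BERT-style tagger); they are not part of the paper.
def self_train(labeled_source, unlabeled_target, train_model, predict_tags,
               rounds=3, confidence=0.95):
    """Iteratively grow the training set with confident pseudo-labels."""
    train_set = list(labeled_source)      # gold data from the data-rich variety
    model = train_model(train_set)        # initial fine-tuning

    for _ in range(rounds):
        pseudo_labeled = []
        for sentence in unlabeled_target:             # data-scarce variety
            tags, token_scores = predict_tags(model, sentence)
            # keep only sentences the current model labels with high confidence
            if token_scores and min(token_scores) >= confidence:
                pseudo_labeled.append((sentence, tags))
        train_set.extend(pseudo_labeled)  # add pseudo-labeled examples
        model = train_model(train_set)    # re-fine-tune on the enlarged set
    return model
```

In a sketch like this, the confidence filter is what keeps noisy pseudo-labels from dominating the enlarged training set.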

Cited by 7 publications (4 citation statements)
References 27 publications

“…Moreover, the study of Khalifa et al. (2021) raised the issue of the high cost associated with obtaining labelled data, particularly in the case of numerous different languages and accents. The authors stated that fine-tuning pre-trained language models for downstream tasks requires an adequate quantity of annotated data.…”
Section: Few-shot and Zero-shot Text Classification
confidence: 99%
“…They do not assume pre-segmentation of the text; however, they only consider the core POS tag, rather than the fully specified morphosyntactic tag. Khalifa et al. (2021) proposed a self-training approach for core POS tagging in which they iteratively improve the model by incorporating predicted examples into the training set used for fine-tuning.…”
Section: Related Work
confidence: 99%
“…Fine-tuning pre-trained language models like BERT (Devlin et al., 2019) has achieved great success in a wide variety of natural language processing (NLP) tasks, e.g., sentiment analysis (Abu Farha et al., 2021), question answering (Antoun et al., 2020), named entity recognition (Ghaddar et al., 2022), and dialect identification (Abdelali et al., 2021). Pre-trained LMs have also been used for enabling technologies such as part-of-speech (POS) tagging (Lan et al., 2020; Khalifa et al., 2021) to produce features for downstream processes. Previous POS tagging results using pre-trained LMs focused on core POS tagsets; however, it is still not clear how these models perform on the full morphosyntactic tagging task of very morphologically rich languages, where the size of the full tagset can be in the thousands.…”
Section: Introduction
confidence: 99%
“…To take advantage of the unlabeled dataset provided in this shared task, we generate a weakly-annotated dataset and re-train the developed model on it. This method has been applied in different ways in several works (Khalifa et al., 2021; El Mekki et al., 2021a; Huang et al., 2021). In our work, we apply the following pipeline:…”
Section: Self-training
confidence: 99%
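The concrete pipeline steps are elided in the excerpt above. As an illustration of the general idea only (weakly annotate the unlabeled data with the current model, then re-train on the combined set), a single-pass sketch could look as follows; `fine_tune`, `predict`, and the `keep_fraction` heuristic are hypothetical, not taken from the cited works.

```python
# Illustrative single-pass weak-labeling pipeline (one plausible instantiation;
# the cited works' exact steps are not reproduced here).
# `fine_tune(model, examples)` and `predict(model, text)` are hypothetical
# stand-ins for any sequence-labeling training / inference routine.
def weak_label_and_retrain(model, gold_data, unlabeled_texts,
                           fine_tune, predict, keep_fraction=0.5):
    """Pseudo-label the unlabeled set, keep the most confident predictions,
    and re-train on gold plus weakly-annotated data."""
    scored = []
    for text in unlabeled_texts:
        tags, score = predict(model, text)   # predictions + sentence-level confidence
        scored.append((score, text, tags))

    # keep the top `keep_fraction` most confident sentences as weak labels
    scored.sort(key=lambda item: item[0], reverse=True)
    cutoff = int(len(scored) * keep_fraction)
    weak_data = [(text, tags) for _, text, tags in scored[:cutoff]]

    return fine_tune(model, list(gold_data) + weak_data)  # re-train on the union
```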