Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.732

A Multitask Learning Approach for Diacritic Restoration

Abstract: In many languages like Arabic, diacritics are used to specify pronunciations as well as meanings. Such diacritics are often omitted in written text, increasing the number of possible pronunciations and meanings for a word. This results in more ambiguous text, making computational processing of such text more difficult. Diacritic restoration is the task of restoring missing diacritics in the written text. Most state-of-the-art diacritic restoration models are built on character level information which helps ge…

Cited by 15 publications (14 citation statements)
References 25 publications
“…The direction of hierarchical pipelines is not necessarily always from low-level tasks to high-level tasks. For example, in [1], the outputs of word-level tasks are fed to the char-level primary task. [99] feeds the output of more general classification models to more specific classification models during training, and the more general classification results are used to optimize beam search of more specific models at test time.…”
Section: Hierarchical
confidence: 99%
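For context, the hierarchical direction this statement describes can be sketched as follows: a word-level auxiliary tagger whose outputs are fed into the character-level primary task (diacritic restoration). This is a minimal PyTorch sketch under assumed module names and dimensions, not the actual architecture of [1].

```python
import torch
import torch.nn as nn

class HierarchicalMTL(nn.Module):
    """Word-level auxiliary task feeding a char-level primary task (sketch)."""
    def __init__(self, n_chars, n_words, n_pos, n_diac, dim=64):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, dim)
        self.char_emb = nn.Embedding(n_chars, dim)
        self.word_lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.pos_head = nn.Linear(2 * dim, n_pos)            # word-level auxiliary task
        # the char-level LSTM consumes char embeddings plus the word-level outputs
        self.char_lstm = nn.LSTM(dim + n_pos, dim, batch_first=True, bidirectional=True)
        self.diac_head = nn.Linear(2 * dim, n_diac)          # char-level primary task

    def forward(self, word_ids, char_ids, char2word):
        # word_ids: (B, W); char_ids: (B, C); char2word: (B, C) parent-word index
        w, _ = self.word_lstm(self.word_emb(word_ids))       # (B, W, 2*dim)
        pos_logits = self.pos_head(w)                        # (B, W, n_pos)
        # feed each character its parent word's auxiliary output (word -> char)
        idx = char2word.unsqueeze(-1).expand(-1, -1, pos_logits.size(-1))
        pos_at_char = pos_logits.gather(1, idx)              # (B, C, n_pos)
        c, _ = self.char_lstm(torch.cat([self.char_emb(char_ids), pos_at_char], dim=-1))
        return pos_logits, self.diac_head(c)                 # optimize both losses jointly
```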
“…Besides, another common practice is to share the first embedding layers across tasks as [58,156] did. [1] shares word and character embedding matrices and combines them differently for different tasks. [103] shares two encoding layers and a vocabulary lookup table between the primary neural machine translation task and the Relevance-based Auxiliary Task (RAT).…”
Section: Modular Architectures
confidence: 99%
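The embedding-sharing pattern this statement attributes to [1] can be made concrete with a short sketch: one word matrix and one character matrix are shared across all tasks, but each task combines them differently. The combination choices here (concatenation for one task, summation for another) and all names are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SharedEmbeddings(nn.Module):
    """One word matrix and one char matrix, shared by every task (sketch)."""
    def __init__(self, n_words, n_chars, dim=64):
        super().__init__()
        self.word = nn.Embedding(n_words, dim)   # shared across tasks
        self.char = nn.Embedding(n_chars, dim)   # shared across tasks

    def for_word_level_task(self, word_ids, char_ids):
        # word_ids: (B, W); char_ids: (B, W, C) chars per word
        # one task concatenates the word vector with pooled char vectors
        pooled = self.char(char_ids).mean(dim=2)             # (B, W, dim)
        return torch.cat([self.word(word_ids), pooled], dim=-1)

    def for_char_level_task(self, char_ids, word_of_char):
        # char_ids, word_of_char: (B, C); another task sums them instead
        return self.char(char_ids) + self.word(word_of_char)
```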
“…It is worth noting that previous studies (Zitouni et al., 2006; Arabiyat, 2015; Fadel et al., 2019a; Abandah and Abdel-Karim, 2019; Alqahtani et al., 2019, 2020) use different methods to compute diacritic error rate (DER) for the ATB and Tashkeela datasets. Therefore, we follow the schema in Zitouni et al. (2006), Arabiyat (2015), and Abandah and Abdel-Karim (2019) to compute DER for ATB, and follow Fadel et al. (2019a) and Alqahtani et al. (2019, 2020) to compute it for Tashkeela.…”
Section: Appendix A Diacritization Labels
confidence: 99%
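The point about divergent DER schemas can be illustrated with a small sketch: schemas differ in which letters are counted (for example, whether letters carrying no gold diacritic are included), so identical predictions can yield different error rates. The flag and toy labels below are assumptions, not the exact conventions of any cited paper.

```python
# Toy DER computation under two assumed counting schemas.
def der(gold, pred, count_undiacritized=True):
    """gold, pred: per-letter diacritic labels, '' meaning no diacritic."""
    errors = total = 0
    for g, p in zip(gold, pred):
        if not count_undiacritized and g == "":
            continue                  # some schemas skip bare letters
        total += 1
        errors += (g != p)
    return 100.0 * errors / max(total, 1)

# The same predictions yield different DER values under the two schemas:
gold = ["a", "", "u", "i"]
pred = ["a", "", "u", "a"]
print(der(gold, pred, count_undiacritized=True))   # 25.0
print(der(gold, pred, count_undiacritized=False))  # 33.33...
```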
“…Restoring Arabic syntactic diacritics based on Long Short-Term Memory (LSTM) networks leads to state-of-the-art performance [3,30,31,32]. These word-level LSTM networks are commonly augmented with Maximum Entropy (MaxEnt) sparse direct connections between the input and the output layers of the tagger [3].…”
Section: Introduction
confidence: 99%
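The augmentation this statement describes can be sketched as a word-level BiLSTM tagger whose output logits are summed with a MaxEnt-style sparse direct connection from the input tokens to the output layer, bypassing the recurrent layers. Layer names and sizes are assumptions, not the exact model of [3].

```python
import torch
import torch.nn as nn

class LstmMaxEntTagger(nn.Module):
    """BiLSTM tagger plus a sparse input-to-output direct connection (sketch)."""
    def __init__(self, vocab_size, n_tags, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.lstm_out = nn.Linear(2 * dim, n_tags)
        # "MaxEnt" path: one weight row of per-tag scores per input word,
        # added directly to the logits; equivalent to a sparse one-hot linear layer
        self.direct = nn.Embedding(vocab_size, n_tags, sparse=True)

    def forward(self, word_ids):
        h, _ = self.lstm(self.emb(word_ids))     # (B, T, 2*dim)
        return self.lstm_out(h) + self.direct(word_ids)  # combined logits
```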