Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1173

Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

Abstract: Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variations in the context of full morphological tagging. We use multitask learning for joint morpholo…

Cited by 17 publications (21 citation statements) | References 33 publications
“…The difference between their work and that of Zalmout and Habash (2017) is the use of a joint model to learn morphological features other than diacritics (i.e., features at the word level), rather than learning these features individually. Zalmout and Habash (2019a) obtained an additional boost in performance (a 0.3% improvement over ours) when they added a dialectal variant of Arabic to the learning process, sharing information between both languages. Alqahtani and Diab (2019a) provide comparable performance to ALL and better performance on some task combinations in terms of WER on all words and on OOV words.…”
Section: Input Representation
Mentioning confidence: 65%
“…However, the model of Zalmout and Habash (2017) performs significantly better on OOV words. Zalmout and Habash (2019a) provide comparable performance to the ALL model. The difference between their work and that of Zalmout and Habash (2017) is the use of a joint model to learn morphological features other than diacritics (i.e., features at the word level), rather than learning these features individually.…”
Section: Input Representation
Mentioning confidence: 89%
“…The tagging architecture is similar to that presented by Zalmout and Habash (2019). We use two Bi-LSTM layers on the word level to model the context for each direction of the target word.…”
Section: Tagger
Mentioning confidence: 99%
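The word-level context encoding this citing paper describes (two stacked Bi-LSTM layers over word embeddings, feeding a per-word tag classifier) can be sketched as below. This is a minimal illustration assuming PyTorch; the vocabulary size, layer widths, and tagset size are hypothetical placeholders, not the authors' actual configuration.

import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Minimal word-level Bi-LSTM tagger sketch (hypothetical sizes)."""
    def __init__(self, vocab_size=30000, emb_dim=128, hidden_dim=256, num_tags=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Two stacked Bi-LSTM layers model the left and right context
        # of each target word, as described in the quoted passage.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers=2,
                               bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, word_ids):       # word_ids: (batch, seq_len)
        x = self.embed(word_ids)       # (batch, seq_len, emb_dim)
        h, _ = self.encoder(x)         # (batch, seq_len, 2 * hidden_dim)
        return self.classifier(h)      # per-word tag scores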
“…Whereas to get the a_j vector, for each morphological feature f, we use a morphological analyzer to obtain all possible feature values of the word to be analyzed. We then embed each value separately (with separate embedding tensors for each feature, learnt within the model), then sum all the resulting vectors to get a_j^f (since these tags are alternatives and do not constitute a sequence) (Zalmout and Habash, 2019). We concatenate the individual a_j^f vectors for each morphological feature f of each word to get a single representation, a_j, for all the features:…”
Section: Tagger
Mentioning confidence: 99%
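The a_j construction in the quoted passage (embed each candidate value of each morphological feature, sum the alternatives into a_j^f, then concatenate the a_j^f vectors over features) can be sketched as below; a minimal illustration assuming PyTorch, with a hypothetical feature inventory and value-vocabulary sizes standing in for a real morphological analyzer's output.

import torch
import torch.nn as nn

class AnalyzerFeatureEncoder(nn.Module):
    """Builds a_j from a morphological analyzer's candidate feature values."""
    def __init__(self, feature_vocab_sizes, emb_dim=32):
        super().__init__()
        # A separate embedding table per morphological feature f,
        # learnt within the model, as in the quoted passage.
        self.tables = nn.ModuleDict({
            feat: nn.Embedding(size, emb_dim)
            for feat, size in feature_vocab_sizes.items()
        })

    def forward(self, candidates):
        # candidates maps each feature name to the ids of all feature
        # values the analyzer proposes for word j. The values are
        # alternatives, not a sequence, so their embeddings are summed
        # to get a_j^f, then concatenated over features to get a_j.
        per_feature = [self.tables[f](ids).sum(dim=0)
                       for f, ids in candidates.items()]
        return torch.cat(per_feature, dim=-1)

# Hypothetical usage for one word with three features:
encoder = AnalyzerFeatureEncoder({"pos": 40, "gender": 4, "number": 4})
a_j = encoder({
    "pos": torch.tensor([3, 17]),     # analyzer proposed two POS values
    "gender": torch.tensor([1]),
    "number": torch.tensor([0, 2]),
})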
“…Later, a number of annotation efforts led to the creation of dialectal annotated corpora of varying sizes following the style of the PATB (Maamouri et al., 2014; Jarrar et al., 2016; Al-Shargi et al., 2016; Alshargi et al., 2019). The created annotations supported models for dialectal Arabic analysis, disambiguation, and tokenization, building on the same successful approaches in MSA (Eskander et al., 2016a; Habash et al., 2013; Pasha et al., 2014; Zalmout and Habash, 2019). More closely related to this paper, Eldesouki et al. (2017) used a de-lexicalized analysis strategy for four colloquial varieties of Arabic, though they also use minimal training data and extract features from an open-class lexicon to learn either an SVM or a bi-LSTM-CRF disambiguation model.…”
Section: Dialectal Arabic Models
Mentioning confidence: 99%