Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d17-1073

Don't Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic

Abstract: This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding features learned from a morphological analyzer to model the space of possible analyses provides additional improvement. We make use …
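The abstract describes training LSTM cells to predict morphological features. As a minimal, self-contained sketch of that idea (in NumPy, with toy dimensions and random weights — this is an illustration of the mechanism, not the paper's actual architecture), a single LSTM cell can be unrolled over a word's character embeddings and its final hidden state projected to one feature label:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step: x is an input embedding, (h, c) the state."""
    n = h.size
    z = W @ x + U @ h + b          # stacked pre-activations for all four gates
    i = sigmoid(z[:n])             # input gate
    f = sigmoid(z[n:2 * n])        # forget gate
    o = sigmoid(z[2 * n:3 * n])    # output gate
    g = np.tanh(z[3 * n:])         # candidate cell update
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def tag_sequence(embeddings, params):
    """Unroll the LSTM over a word's embeddings and predict one label
    (e.g. a POS or gender value) from the final hidden state."""
    W, U, b, V = params
    h = np.zeros(U.shape[1])
    c = np.zeros_like(h)
    for x in embeddings:
        h, c = lstm_step(x, h, c, W, U, b)
    logits = V @ h                 # linear projection to label scores
    return int(np.argmax(logits))

rng = np.random.default_rng(0)
emb_dim, hid_dim, n_labels = 8, 16, 5
params = (
    rng.normal(size=(4 * hid_dim, emb_dim)) * 0.1,  # input weights W
    rng.normal(size=(4 * hid_dim, hid_dim)) * 0.1,  # recurrent weights U
    np.zeros(4 * hid_dim),                          # bias b
    rng.normal(size=(n_labels, hid_dim)) * 0.1,     # output projection V
)
word = [rng.normal(size=emb_dim) for _ in range(6)]  # toy character embeddings
label = tag_sequence(word, params)
```

A real system would train these weights and run one such tagger (or a shared encoder with multiple output heads) per morphological feature.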

Cited by 46 publications (76 citation statements); references 22 publications.
“…Other research, however, argues that enriching embeddings with additional morphological information boosts performance. [13] demonstrates this by using the output of a morphological analyzer to further improve candidate ranking in a morphological disambiguation task for Arabic. In research on Burmese word segmentation, [14] address the problem by employing binary classification with classifiers such as CRFs.…”
Section: Related Work
confidence: 98%
“…We also use weighted matching, where instead of assigning ones and zeros for matched/mismatched features, we use a feature-specific matching weight. We replicate the morphological disambiguation pipeline presented in earlier contributions (Zalmout and Habash, 2017), and use the same parameter values and feature weights.…”
Section: Full Morphological Disambiguation
confidence: 99%
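The weighted matching described above can be sketched in a few lines: each candidate analysis from the morphological analyzer is scored by summing a per-feature weight wherever it agrees with the tagger's predictions, instead of an unweighted 0/1 per feature. The feature names and weight values below are illustrative assumptions, not the values used in the paper.

```python
# Illustrative feature-specific matching weights (not the paper's values).
FEATURE_WEIGHTS = {"pos": 2.0, "gen": 1.0, "num": 1.0, "asp": 0.5}

def match_score(predicted, analysis, weights=FEATURE_WEIGHTS):
    """Sum the weight of every feature where the candidate analysis
    agrees with the predicted feature values."""
    return sum(w for f, w in weights.items()
               if analysis.get(f) == predicted.get(f))

def disambiguate(predicted, analyses):
    """Pick the analyzer candidate with the highest weighted match."""
    return max(analyses, key=lambda a: match_score(predicted, a))

predicted = {"pos": "noun", "gen": "f", "num": "s", "asp": "na"}
candidates = [
    {"pos": "verb", "gen": "f", "num": "s", "asp": "p"},   # score 2.0
    {"pos": "noun", "gen": "m", "num": "s", "asp": "na"},  # score 3.5
]
best = disambiguate(predicted, candidates)
print(best["pos"])  # → noun
```

With unweighted matching both candidates would be closer (2 vs. 3 matches); the weights let a high-value feature such as POS dominate the ranking.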
“…[Results table comparing MADAMIRA-MSA (Pasha et al., 2014) and Zalmout and Habash (2017) on the DEV and BLIND TEST sets across the FULL, FEATS, DIAC, LEX, and POS metrics; numeric values unrecoverable from extraction.] Embedding Models: Joint embedding spaces between the dialects, whether through embedding-space mapping or through learning the embeddings on the combined corpus, did not perform well. Using separate embedding models (whether for word or character embeddings) for each dialect shows better accuracy.…”
Section: Dev Test
confidence: 99%
“…Arabic diacritization, which can be considered a form of text normalization, has received a number of neural efforts (Belinkov and Glass, 2015; Abandah et al., 2015). However, state-of-the-art approaches for end-to-end text normalization rely on several additional models and rule-based approaches as hybrid models (Pasha et al., 2014; Nawar, 2015; Zalmout and Habash, 2017), which introduce direct human knowledge into the system but are limited to correcting specific mistakes and rely on expert knowledge to be developed.…”
Section: Related Work
confidence: 99%