Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology 2018
DOI: 10.18653/v1/w18-5806

Complementary Strategies for Low Resourced Morphological Modeling

Abstract: Morphologically rich languages are challenging for natural language processing tasks due to data sparsity. This can be addressed either by introducing out-of-context morphological knowledge, or by developing machine learning architectures that specifically target data sparsity and/or morphological information. We find these approaches to complement each other in a morphological paradigm modeling task in Modern Standard Arabic, which, in addition to being morphologically complex, features ubiquitous ambiguity, …

Cited by 8 publications (3 citation statements)
References 36 publications
“…Finally, it should be noted that, at the time of writing this paper, Modern Greek data sets and text mining approaches are certainly fewer in number when compared against works focusing on other, more widely spoken, popular languages, such as English, German, Spanish, French, Italian, Russian or Chinese. In general, when compared to corresponding tasks in other languages, the obtained accuracy results in Modern Greek web text mining tasks indicate a better overall performance than [50][51][52][53][54][55][56][57], comparable to [58][59][60], or worse than [61][62][63].…”
Section: Approaches On Developing Modern Greek Social Web Text Data Sets and On Modern (citation type: mentioning)
confidence: 98%
“…In future work, modeling semantics can fix such errors, e.g., knowing that rAkb is animate makes plural For future work, we can pre-train on raw corpora to give our model access to such information (Devlin et al., 2019). Indeed Erdmann and Habash (2018) found distributional information to benefit inflectional paradigm clustering in Arabic. Though the benefits should generalize as semantics correlates with inflection class in many languages (Wurzel, 1989; Aronoff, 1992; Harris, 1992; Noyer, 1992; Carstairs-McCarthy, 1994; Corbett and Fraser, 2000; Kastner, 2019).…”
Section: Arabic Error Analysis (citation type: mentioning)
confidence: 99%
“…Erdmann et al. (2020) established a baseline for the Paradigm Discovery Problem that clusters the unannotated sentences first by a combination of string similarity and lexical semantics and then uses this clustering as input for a neural transducer. Erdmann and Habash (2018) investigated the benefits of different similarity models as they apply to Arabic dialects. Their findings demonstrated that Word2Vec embeddings significantly underperformed in comparison to the Levenshtein distance baseline.…”
Section: Previous Work (citation type: mentioning)
confidence: 99%
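
The last citation statement refers to a Levenshtein distance baseline for word similarity. As a rough illustration of what such a baseline computes, here is a minimal sketch of the standard edit-distance recurrence in Python. It is not the implementation used by Erdmann and Habash (2018); the function names `levenshtein` and `normalized_similarity`, the length normalization, and the example strings are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in the inner loop so each row is small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical helper: map edit distance to a score in [0, 1],
    where 1.0 means the two strings are identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Illustrative strings only; they are not taken from the cited papers.
print(levenshtein("kitten", "sitting"))
print(normalized_similarity("ktb", "ktAb"))
```

A score of this kind depends only on surface form, so it can group word forms that share most of their characters without any training data, whereas an embedding-based similarity has to be learned from a corpus.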