Proceedings of the Eighteenth Conference on Computational Natural Language Learning 2014
DOI: 10.3115/v1/w14-1604

Automatic Transliteration of Romanized Dialectal Arabic

Abstract: In this paper, we address the problem of converting Dialectal Arabic (DA) text that is written in the Latin script (called Arabizi) into Arabic script following the CODA convention for DA orthography. The presented system uses a finite state transducer trained at the character level to generate all possible transliterations for the input Arabizi words. We then filter the generated list using a DA morphological analyzer. After that we pick the best choice for each input word using a language model. We achieve a…
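The three stages named in the abstract (generate candidates, filter them, pick the best one) can be illustrated with a minimal Python sketch. This is not the authors' system: a hand-written character table stands in for the trained finite state transducer, a small word set stands in for the DA morphological analyzer, and a table of unigram counts stands in for the language model. All names and values below (`CHAR_MAP`, `LEXICON`, `UNIGRAM_COUNTS`, and the fallback when the filter rejects every candidate) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy table: each Arabizi character maps to possible Arabic
# characters. The paper trains a character-level transducer instead.
CHAR_MAP = {
    "k": ["ك"],
    "t": ["ت", "ط"],
    "a": ["ا", ""],  # assumption: a vowel may be left unwritten
    "b": ["ب"],
}

# Hypothetical stand-in for the morphological analyzer: a set of accepted words.
LEXICON = {"كتاب", "كتب"}

# Hypothetical stand-in for the language model: unigram counts.
UNIGRAM_COUNTS = {"كتاب": 50, "كتب": 20}


def generate_candidates(word):
    """Expand every character into its options and join all combinations."""
    options = [CHAR_MAP.get(ch, [ch]) for ch in word]
    return {"".join(combo) for combo in product(*options)}


def filter_candidates(candidates):
    """Keep only the candidates that the toy lexicon accepts."""
    return {c for c in candidates if c in LEXICON}


def pick_best(candidates):
    """Return the candidate with the highest toy count (ties broken by sort order)."""
    if not candidates:
        return None
    return max(sorted(candidates), key=lambda c: UNIGRAM_COUNTS.get(c, 0))


def transliterate(word):
    """Run the three stages; fall back to unfiltered candidates (an assumption)."""
    candidates = generate_candidates(word)
    valid = filter_candidates(candidates) or candidates
    return pick_best(valid)
```

With these toy tables, `transliterate("ktab")` generates four candidates, keeps the two found in the lexicon, and returns `كتاب`. A real language model would score each candidate in its sentence context instead of in isolation.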

Cited by 40 publications (21 citation statements); references 15 publications.
“…It replaces a sound with a Latin letter or a group of Latin letters. This mainly depends on language-specific assumptions about grapheme-to-phoneme mapping [1]. The same is true for the Tunisian Dialect, which has no standard Arabic-script orthography.…”
Section: The Tunisian Dialect Spontaneous Orthography (mentioning)
confidence: 99%
“…We noticed that a Dialect word can be written in several ways because in cases where there is no standard orthography, people use a spontaneous orthography that is based on various criteria [1]. The main criterion is phonology.…”
Section: The Tunisian Dialect Spontaneous Orthography (mentioning)
confidence: 99%
“…El-Kahky et al. (2011) use graph reinforcement models to learn mapping between characters in different scripts in the context of transliteration mining. Al-Badrashiny et al. (2014) present a similar system where words are transcribed using a finite state transducer constructed from an aligned parallel Arabizi-Arabic corpus. The disadvantage of this and other learned methods (Ristad and Yianilos, 1998; Lin and Chen, 2002; Mangu and Brill, 1997; Jiampojamarn et al., 2009) is that they require aligned parallel corpora whereas our approach performs well without any data.…”
Section: Related Work (mentioning)
confidence: 99%
“…Transliteration. Transliteration systems invariably use either a direct mapping between substrings in the two scripts (Al-Onaizan and Knight, 2002; AbdulJaleel and Larkey, 2003; El-Kahky et al., 2011; Al-Badrashiny et al., 2014), or use an intermediate script such as the International Phonetic Alphabet (IPA) (Brawer et al., 2010) or Double Metaphones (Philips, 2000). AbdulJaleel and Larkey (2003) proposed a statistical method for transliteration between Arabic and English.…”
Section: Related Work (mentioning)
confidence: 99%