Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP) 2014
DOI: 10.3115/v1/w14-3615

Arabic Spelling Correction using Supervised Learning

Abstract: In this work, we address the problem of spelling correction in the Arabic language using the new corpus provided by the QALB (Qatar Arabic Language Bank) project, an annotated corpus of sentences with errors and their corrections. The corpus contains edit, add before, split, merge, add after, move, and other error types. We are concerned with the first four error types, as they account for more than 90% of the spelling errors in the corpus. The proposed system has many models to address each error type on…
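To make the four error types named in the abstract concrete, here is a minimal sketch of how corrections of each type could be applied to a tokenized sentence. The `(kind, index, payload)` action format and the `apply_actions` helper are assumptions made for illustration; they are not the QALB annotation format and not the authors' system. English tokens are used only for readability.

```python
from typing import List, Tuple

# Hypothetical action format: (kind, token_index, payload).
# This is an illustrative assumption, not the QALB file format.
Action = Tuple[str, int, str]


def apply_actions(tokens: List[str], actions: List[Action]) -> List[str]:
    """Apply edit / add_before / split / merge corrections to a token list.

    Assumes at most one action per token index. Actions are applied from
    right to left so that indices of earlier tokens stay valid.
    """
    out = list(tokens)
    for kind, i, payload in sorted(actions, key=lambda a: a[1], reverse=True):
        if kind == "edit":
            # Replace a misspelled token with its correction.
            out[i] = payload
        elif kind == "add_before":
            # Insert a missing token (often punctuation) before position i.
            out.insert(i, payload)
        elif kind == "split":
            # One token that should have been several, e.g. "onthe" -> "on the".
            out[i:i + 1] = payload.split()
        elif kind == "merge":
            # Two adjacent tokens that should have been one, e.g. "m at" -> "mat".
            out[i:i + 2] = [payload]
        else:
            raise ValueError(f"unsupported action kind: {kind}")
    return out


corrected = apply_actions(
    ["well", "the", "kat", "sat", "onthe", "m", "at"],
    [
        ("add_before", 1, ","),
        ("edit", 2, "cat"),
        ("split", 4, "on the"),
        ("merge", 5, "mat"),
    ],
)
print(" ".join(corrected))  # well , the cat sat on the mat
```

Applying actions right to left is a simple way to keep the original token indices meaningful while the list grows or shrinks; a real system would also need to decide how to resolve several actions on the same token.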

Cited by 15 publications (11 citation statements)
References 4 publications
“…As expected, following the Advanced mode, our three annotators could annotate an average of 618.93 words per hour which is double those annotated in the Basic mode (only 302.14 words). Adding [footnote 12: The guidelines are available upon request.]…”
Section: Annotation Analysis and Results
confidence: 99%
“…Techniques and tools reported in the literature for supporting the Arabic spelling errors detection and correction task include morphological analysis [12] [16], finite state transducer with edit distance [9] [8], statistical character level transformation [14], N-gram scores [17] [8], conditional random fields [14] [8], and Naïve Bayes [15]. Similar to systems described in the literature, Arib utilizes language resources such as dictionaries and corpora as well as the application of different techniques to support the task of Arabic spelling error detection and correction.…”
Section: Related Work
confidence: 99%
“…Error detection: Techniques used in the literature for detecting Arabic spelling errors are essentially based on two approaches: the language rules (AlShenaifi et al, 2015;Shaalan et al, 2010;Hassan et al, 2014) or a dictionary (Attia et al, 2014;Zerrouki et al, 2014;Alkanhal et al, 2012). For the first technique, detecting whether a word is misspelled or not depends on morphological analyzers.…”
Section: Auto-correction Of Hamoza
confidence: 99%