2012
DOI: 10.5120/8400-2168

Towards Arabic Spell-Checker Based on N-Grams Scores

Abstract: The main purpose of this paper is to develop a simple and flexible spell-checker for the Arabic language. The proposed spell-checker is based on N-gram scores. For this purpose, eleven matrices are built to represent the combinations of letters in Arabic words. Each matrix concerns the connection between the two letters of a 2-gram. Each cell in the generated matrix is assigned an integer value of 2, 1, or 0. A cell is assigned the value 2 in the corresponding matrix if the word ends with these two letters, and assig…
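The abstract is cut off before it defines the scores 1 and 0, so the following is only a minimal sketch of a positional bigram scoring scheme, not the authors' method. It assumes one matrix per bigram position (stored here as a sparse dictionary), the score 2 for a bigram that ends a known word (as the abstract states), the score 1 for a bigram seen at that position but not word-finally (an assumption), and 0 for a bigram never seen at that position (an assumption). The names `build_matrices` and `is_plausible` and the three-word lexicon are hypothetical.

```python
from collections import defaultdict

# Assumption: one matrix per bigram position, which would cover words of up to 12 letters.
MAX_POSITIONS = 11


def bigrams(word):
    """Return the consecutive letter pairs of a word, in order."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def build_matrices(lexicon):
    """Build one sparse score matrix per bigram position.

    matrices[p][pair] is 2 if some lexicon word ends with `pair` at position p,
    1 if `pair` was only seen at position p inside a word (assumed meaning),
    and missing (treated as 0) if it was never seen there (assumed meaning).
    """
    matrices = [defaultdict(int) for _ in range(MAX_POSITIONS)]
    for word in lexicon:
        pairs = bigrams(word)
        if not pairs or len(pairs) > MAX_POSITIONS:
            continue  # skip one-letter words and words too long for the matrices
        last = len(pairs) - 1
        for pos, pair in enumerate(pairs):
            score = 2 if pos == last else 1
            matrices[pos][pair] = max(matrices[pos][pair], score)
    return matrices


def is_plausible(word, matrices):
    """Accept a word only if every bigram is known at its position
    and the final bigram is known to end a word (score 2)."""
    pairs = bigrams(word)
    if not pairs or len(pairs) > MAX_POSITIONS:
        return False
    last = len(pairs) - 1
    for pos, pair in enumerate(pairs):
        score = matrices[pos].get(pair, 0)
        if score == 0 or (pos == last and score != 2):
            return False
    return True


lexicon = ["كتب", "كتاب", "مكتب"]  # toy lexicon, illustrative only
matrices = build_matrices(lexicon)
print(is_plausible("كتاب", matrices))  # True: all bigrams known, last one ends a lexicon word
print(is_plausible("كبت", matrices))   # False: the first bigram never occurs at position 0
```

A check of this kind only flags letter sequences that never occur in the lexicon; it can accept non-words whose bigrams all happen to be attested, which is the usual trade-off of bigram-based detection against a full dictionary lookup.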

Cited by 9 publications (10 citation statements)
References 7 publications
“…Techniques and tools reported in the literature for supporting the Arabic spelling errors detection and correction task include morphological analysis [12] [16], finite state transducer with edit distance [9] [8], statistical character level transformation [14], N-gram scores [17] [8], conditional random fields [14] [8], and Naïve Base [15]. Similar to systems described in the literature, Arib utilizes language resources such as dictionaries and corpora as well as the application of different techniques to support the task of Arabic spelling error detection and correction.…”
Section: Related Work | Citation type: mentioning
confidence: 99%
“…In [1], Muaidi et al divide each word into bigrams to develop an Arabic spell-checker. Each bigram is given a score.…”
Section: Arabic Normalization | Citation type: mentioning
confidence: 99%
“…Each tweet is limited to 140 characters and can comprise text, links, symbols, videos, and/or pictures. Users in Twitter may have both a following and followers, thereby forming a social network.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…The corpus consists of 101,987 word types. It is implemented by a visual basic tool that browse daily newspapers and articles via web site [6].…”
Section: Corpus Creation | Citation type: mentioning
confidence: 99%