2011
DOI: 10.4236/jsea.2011.49060

The Enhancement of Arabic Stemming by Using Light Stemming and Dictionary-Based Stemming

Abstract: Word stemming is one of the most important factors affecting the performance of many natural language processing applications, such as part-of-speech tagging, syntactic parsing, machine translation, and information retrieval. Computational stemming is a pressing problem for Arabic natural language processing because Arabic is a highly inflected language. Existing stemmers have ignored the handling of multi-word expressions and the identification of Arabic names. We used the enhanced stemming for…

Citations: cited by 9 publications (5 citation statements)
References: 9 publications
“…Additionally, this method cannot model non-contiguous roots, of which Semitic languages have many. Other unsupervised methods utilize dictionaries to select the characters from within words (Darwish, 2002; Boudlal et al., 2011; Alhanini and Ab Aziz, 2011). Another line of research leverages the templatic nature for human-constructed rule-based constraints (Elghamry, 2005; Rodrigues and Cavar, 2007; Choueka, 1990).…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…approaches for root-and-pattern morphology involve rule-based root extraction (Khoja and Garside, 1999; Taghva et al., 2005; Ababneh et al., 2012; El-Beltagy and Rafea, 2011; Alhanini et al., 2011; Al-Shalabi and Evens, 1998), and others that require training data of word-root pairs (Attia et al., 2016; Al-Serhan and Ayesh, 2006), all developed for processing Arabic. The resource-intensive nature of these methods not only limits their wider applicability to other Semitic languages but also prevents from handling the productive process more generally.…”
Section: Current (citation type: mentioning)
confidence: 99%
“…The study found the hybrid approach to have an average accuracy of approximately 89.3%. A rule-based stemmer for Bengali language which uses stem dictionary for further validation was developed in [8], whereas an enhanced stemmer for Arabic was built by integrating light and dictionary based stemming [9]. The latter study found their stemmer to produce an average accuracy of approximately 96%.…”
Section: Stemming (citation type: mentioning)
confidence: 99%