Towards enhancing retrieval effectiveness of search engines for diacritisized Arabic documents

Hammo, Bassam

doi:10.1007/s10791-008-9081-9

Cited by 33 publications

(21 citation statements)

References 35 publications

“…In his study for the Holy Quran, Hammo [24] stated that most of the failing cases of Khoja when it was used to stem words of the Holy book, were occurred when stemming proper names such as the names of Prophets, angels, ancient cities, places and people, numerals, as well as words with the diacritical mark sha-dda.…”

Section: Root-based and Morphological Analyzers (mentioning)

confidence: 99%

A Comparative Survey on Arabic Stemming: Approaches and Challenges

Mustafa, Eldeen, Bani‐Ahmad et al., 2017, IIM

Arabic, as one of the Semitic languages, has a very rich and complex morphology, which is radically different from the European and the East Asian languages. The derivational system of Arabic is therefore based on roots, which are often inflected to compose words, using a spectacular and relatively large set of Arabic affix morphemes, e.g., antefixes, prefixes, suffixes, etc. Stemming is the process of rendering all the inflected forms of a word into a common canonical form. Stemming is one of the early and major phases in natural language processing, machine translation and information retrieval tasks. A number of Arabic language stemmers were proposed. Examples include light stemming, morphological analysis, statistical-based stemming, N-grams and parallel corpora (collections). Motivated by the reported results in the literature, this paper attempts to exhaustively review current achievements for stemming Arabic texts. A variety of algorithms are discussed. The main contribution of the paper is to provide better understanding among existing approaches with the hope of building an error-free and effective Arabic stemmer in the near future.

“…Stemming is the process of correlating several terms onto one common representation in the base form [16]. It minimizes the index size because it has the advantage of reducing storage requirements by eliminating the redundant words.…”

Section: Introduction (mentioning)

confidence: 99%

“…Stemming uses morphological heuristics in order to remove affixes from words and the processing cost is relatively low. For those reasons, the stemming is important and highly attractive for many natural language processing (NLP) fields such as: information retrieval (IR), question answering (QA), information extraction (IE), machine translation (MT), text summarizations (TS), Text Classification (TC), Text Clustering (TClu), Text segmentation (TS), Indexing (Ind), and Automatic Speech Recognition (ASR) [16]. There are many developed algorithmic stemming and various morphological analysis approaches to achieve morphologically related forms combined under the same stem using stemmer [14].…”

Section: Introduction (mentioning)

confidence: 99%

A Comparative Study on Arabic Stemmers

Dahab, Ibrahim, Al-Mutawa, 2015, IJCA

Stemming is considered as a pre-processing step in many applications: text mining, information retrieval, machine translation etc. The Arabic language has many special cases or properties that affect stemming or any automatic method, it depends on both inflectional and derivational morphology to produce the various forms of the language words. Many researchers have proposed algorithms to solve the problems of stemming. This paper aims to make a comparison study among the existing Arabic stemmers, the comparison study is based on the methodologies, the usage, main idea, algorithm, the affixes, limitations, output, and the stemmers' sensitivity for both diacritics and context. General Terms: Natural Language Processing.

“…The growing number of Arabic documents on the Web signals the need for advanced and improved Web search engines that retrieve related Arabic documents with high correctness and less time based on user requests. Precision, percentage of the retrieved related-documents and recall are measures used to determine the IR system's effectiveness and correctness (Abdelmgeid, 2007;Hammo, 2009).…”

Section: Introduction (mentioning)

confidence: 99%

Enhanced Search Scheme Precision and Performance using a GA Approach with Application to Arabic Content

Ghwanmeh, 2012, JAC

Literature examination shows that information search engines in Arabic are few compared to those available in English and other languages. Additionally, search engines face many problems when programmed in the Arabic language, including difficulty and uncertainty. Employing Genetic Algorithm within the search scheme to improve performance and exactness and tackle issues with non-accurateness of search systems in which Arabic content is used can be considered an advancement. An enhanced search scheme that provides exactness, precision, and performance by applying the Genetic Algorithm Technique to Arabic content is presented in this paper. Based on the user starting page selection, the system employs its dynamic characteristics to search related pages on the Web. A series of experiments has been conducted to test the quality and effectiveness of the proposed system by means of well-known test-base collections (namely CISI, CACM, and NPL) and 242 Arabic-content sites. General results revealed that the proposed system retrieved the largest number of appropriate documents and minimal non-related documents with respect to user requests in high-performance information retrieval systems that use the Genetic Algorithm.
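
The precision and recall measures named in the citation statement above can be illustrated with a minimal sketch. It uses the standard set-based definitions (precision is the share of retrieved documents that are relevant, recall is the share of relevant documents that were retrieved); the function name `precision_recall` and the document identifiers are hypothetical and do not come from any of the cited papers.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query.

    retrieved: iterable of document ids returned by the search engine
    relevant:  iterable of document ids judged relevant for the query
    (both are illustrative inputs, not data from the cited studies)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents retrieved, three judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up query two of the four retrieved documents are relevant and two of the three relevant documents are found, so a system can score differently on the two measures for the same result list.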
