2015
DOI: 10.1177/0165551515594722

Semantically enhanced pseudo relevance feedback for Arabic information retrieval

Abstract: The conventional information retrieval (IR) framework consists of four primary phases, namely, pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance an AIR by improving the processes in a conventional IR framework. We introduce an enhanced stop-word list in the pre-processing level and investigate several Arabic stemmers. In addition, an Arabic WordNet was utilized in the corpus and query expansion l…

Cited by 15 publications (17 citation statements)
References 26 publications
“…The main contribution consists of using word’s part-of-speech to select the appropriate synonyms. More recently, Atwan et al [9] presented an automatic corpus-based expansion technique combining AWN and corpus-based semantic similarity to select expansion terms. The results showed that the automatic expansion technique enhances the accuracy of Arabic IR on TREC 2001 dataset.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ideally, however, expansion terms should be selected based on their similarity to query terms as well as their distribution in the set of pseudo-relevant documents. Some studies have indeed proposed to do so with mutual information [8–10]. We propose here to use word embedding for this task and focus on the Arabic language.…”
Section: Introduction (mentioning)
confidence: 99%
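
The selection criterion quoted above (similarity to the query terms combined with distribution in the pseudo-relevant set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of any cited paper: the function name `score_expansion_terms`, the product used to combine the two signals, and the hand-made toy vectors are all hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_expansion_terms(query_terms, feedback_docs, vectors):
    """Rank candidate expansion terms (hypothetical scoring scheme).

    query_terms:   list of query tokens
    feedback_docs: list of token lists (the pseudo-relevant documents)
    vectors:       dict mapping a term to its embedding (toy values here)
    """
    # Distribution signal: in how many feedback documents each term occurs.
    doc_freq = Counter()
    for doc in feedback_docs:
        doc_freq.update(set(doc))

    scores = {}
    for term, count in doc_freq.items():
        if term in query_terms or term not in vectors:
            continue
        sims = [cosine(vectors[term], vectors[q])
                for q in query_terms if q in vectors]
        if not sims:
            continue
        similarity = sum(sims) / len(sims)         # closeness to the query
        distribution = count / len(feedback_docs)  # share of feedback docs
        # Assumption: combine the two signals with a simple product.
        scores[term] = similarity * distribution
    return sorted(scores, key=scores.get, reverse=True)


# Toy example with made-up two-dimensional "embeddings".
vectors = {"retrieval": [1.0, 0.0], "search": [0.9, 0.1], "banana": [0.0, 1.0]}
docs = [["retrieval", "search", "banana"], ["search", "retrieval"]]
ranked = score_expansion_terms(["retrieval"], docs, vectors)
```

In this toy setup a term that is both close to the query and frequent in the feedback set ranks above one that merely co-occurs once.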
“…The preprocessing steps are done on the document terms before building the index and on the user query before matching process. The preprocessing should be done first to gain the benefit of speeding-up the retrieval time [18,19]. The preprocessing steps involve tokenization, removal of stop-words and stemming.…”
Section: A. Preprocessing (mentioning)
confidence: 99%
“…These words don't give any hint for the content of their documents. In information retrieval systems, stop-words should be eliminated (by referring to a stop-word list) from the query text and from the set of index terms [18,20]. Figure 3 shows the tokens of a document title after removing the stop-words.…”
Section: Removal Of Stop-words (mentioning)
confidence: 99%
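
The preprocessing steps described in the two statements above (tokenization followed by stop-word removal, applied to both documents and queries) might look like the minimal sketch below. The five-word stop-word list and the function names are assumptions made for illustration; stemming is left out because it would require a real Arabic stemmer.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"في", "من", "على", "إلى", "عن"}


def tokenize(text):
    """Split text into word tokens; in Python 3, \\w also matches Arabic letters."""
    return re.findall(r"\w+", text)


def preprocess(text, stop_words=STOP_WORDS):
    """Tokenize, then drop any token found in the stop-word list."""
    return [token for token in tokenize(text) if token not in stop_words]


# The same function is applied to documents before indexing
# and to the user query before matching.
title_tokens = preprocess("استرجاع المعلومات في اللغة العربية")
```

Running both the index terms and the query through one function keeps the two sides consistent, so a stop-word dropped from the index is never looked up at query time.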