2015 2nd International Conference on Knowledge-Based Engineering and Innovation (KBEI), 2015
DOI: 10.1109/kbei.2015.7436031

PSWG: An automatic stop-word list generator for Persian information retrieval systems based on similarity function & POS information

citations
Cited by 10 publications
(4 citation statements)
references
References 7 publications
“…Ref. [15] exploits Part-of-Speech information. These approaches cannot be directly compared to our proposal, in which we purposely start from plain text and avoid any kind of aid or pre-processing.…”
Section: Related Work
mentioning
confidence: 99%
“…As witnessed by the most recent survey paper available on Stopword Removal, Reference [4], the literature after Reference [20] mainly focused on specific and peculiar languages, especially those using non-Latin script. A list of such works (often published in National conferences or journals) includes Arabic [24][25][26], Chinese [27,28], Persian [15], Sanskrit [29], Gujarati [30], Punjabi [31], Hindi [32][33][34], Bengali [35], Sinhala [36], and Tamil [37]. Here, we aim at devising an approach that can be applied to different languages; thus, we will not discuss these works in the following, nor can we compare our proposal to these works, which use very tailored approaches.…”
mentioning
confidence: 99%
“…In a more recent study, [17], linguistic and syntactic information are aggregated to build stop-word list in Persian information retrieval systems. In [17], part of speech (POS) tags are employed together with statistical measures such as entropy and the method is assessed by precision. The precision values reported are in range [0.25 0.3] for the whole set of different POS tags.…”
Section: Related Work
mentioning
confidence: 99%
“…Stop-words list was automatically generated for Egyptian dialect using frequency method [30]. The aggregate method was used for generation of stop-words list for Persian language by combining statistical and similarity function approaches [31]. A deterministic finite automaton was used for generation of stopwords for Hindi text [32].…”
Section: Introduction
mentioning
confidence: 99%