2009
DOI: 10.1002/asi.21173

Feature reduction techniques for Arabic text categorization

Abstract: This paper presents and compares three feature reduction techniques that were applied to Arabic text. The techniques include stemming, light stemming, and word clusters. The effects of the aforementioned techniques were studied and analyzed on the K-nearest-neighbor classifier. Stemming reduces words to their stems. Light stemming, by comparison, removes common affixes from words without reducing them to their stems. Word clusters group synonymous words into clusters and each cluster is represented by a single…
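
As a rough illustration of the light stemming and word cluster ideas described in the abstract, the following Python sketch strips common affixes and then maps synonyms to a single cluster representative. It is not the paper's implementation: the affix lists in `PREFIXES` and `SUFFIXES`, the `min_len` threshold, the `CLUSTERS` mapping, and the function names `light_stem` and `reduce_features` are all assumptions chosen for demonstration.

```python
# Illustrative affix lists (an assumption, not the lists used in the paper).
# Longer prefixes come first so that "وال" is tried before "ال" or "و".
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي")


def light_stem(word, min_len=3):
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


# Hypothetical synonym clusters: each word maps to one cluster representative.
CLUSTERS = {"منزل": "بيت", "دار": "بيت"}


def reduce_features(tokens):
    """Light-stem every token, then replace it with its cluster representative."""
    stemmed = [light_stem(token) for token in tokens]
    return [CLUSTERS.get(token, token) for token in stemmed]


tokens = ["المنزل", "منزل", "البيت", "بيت", "والدار"]
reduced = reduce_features(tokens)
print(len(set(tokens)), "->", len(set(reduced)))
```

In this toy input, five distinct surface forms collapse into one feature, which is the kind of vector-size reduction the techniques aim for. A full (root) stemmer would go further than `light_stem` by also removing infixes and pattern letters.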

Cited by 53 publications
(32 citation statements)
References 11 publications
“…He claimed that experiments proved that the stemming technique is not always effective for Arabic document categorization. His experiments
[4] | ' | ' | ' | ' | English | SVM
Toman et al [5] | ' | ' | ' | - | English and Czech | NB
Chirawichitchai et al [9] | - | ' | ' | ' | Thai | NB, DT, SVM
Mesleh [10,11] | ' | ' | - | ' | Arabic | SVM
Duwairi et al [15] | - | ' | ' | - | Arabic | KNN
Kanan [14] | - | ' | ' | - | Arabic | SVM, NB, RF
Zaki et al [18] | ' | ' | ' | - | Arabic | KNN
Al-Shargabi et al [12] | - | ' | - | - | Arabic | NB, SVM, J48
Khorsheed et al [16] | - | ' | - | ' | Arabic | KNN, NB, SVM, etc.
Ababneh et al [17] | ' | ' | - | - | Arabic | KNN…”
Section: Related Work (mentioning)
confidence: 99%
“…Uysal et al [7] | N/F | - | + (often)
Pomikálek et al [6] | N/F | + | +
Méndez et al [8] | N/F | - | -
Chirawichitchai et al [9] | N/F | N/F | N/F
Song et al [4] | N/F | + | +
Toman et al [5] | N/F | + | -
Mesleh [10,11] | N/F | N/F | -
Duwairi et al [15] | N/F | N/F | +
Kanan [14] 2015 | N/F | N/F | +
Zaki et al [18] | N/F | N/F | +
Al-Shargabi et al [12] | N/F | + | N/F
Khorsheed et al [16] | N/F | N/F | N/F
Ababneh et al [17] | N/F | N/F | -…”
Section: Nr Sr Ls (mentioning)
confidence: 99%
“…This method does not significantly improve TC results over the BoW method (Elberrichi and Abidi, 2012). Extracting the root of the word using stemming methods has also been used to enhance TC results (Duwairi et al, 2009; Kanaan et al, 2009; Syiam et al, 2006). Still other researchers have attempted to use words in their orthographic form (without stemming) in TC (Mesleh, 2007; Thabtah et al, 2009).…”
Section: Related Work (mentioning)
confidence: 99%
“…al [18] compared three dimensionality reduction techniques: stemming, light stemming, and word clusters. The purpose of employing these methods is to reduce the size of document vectors without affecting the accuracy of the classifiers.…”
Section: Related Work (mentioning)
confidence: 99%