2015
DOI: 10.1002/asi.23609

Automated Arabic text classification with PStemmer, machine learning, and a tailored news article taxonomy

Abstract: Arabic news articles in electronic collections are difficult to work with. Browsing by category is rarely supported. While helpful machine learning methods have been applied successfully to similar situations for English news articles, limited research has been completed to yield suitable solutions for Arabic news. In connection with a QNRF funded project to build digital library community and infrastructure in Qatar, we developed software for browsing a collection of about 237K Arabic news articles, which sho…

Cited by 47 publications
(29 citation statements)
References 22 publications
“…He claimed that experiments proved that the stemming technique is not always effective for Arabic document categorization. His experiments
[4]: ' ' ' '; English; SVM
Toman et al [5]: ' ' ' -; English and Czech; NB
Chirawichitchai et al [9]: - ' ' '; Thai; NB, DT, SVM
Mesleh [10,11]: ' ' - '; Arabic; SVM
Duwairi et al [15]: - ' ' -; Arabic; KNN
Kanan [14]: - ' ' -; Arabic; SVM, NB, RF
Zaki et al [18]: ' ' ' -; Arabic; KNN
Al-Shargabi et al [12]: - ' - -; Arabic; NB, SVM, J48
Khorsheed et al [16]: - ' - '; Arabic; KNN, NB, SVM, etc.
Ababneh et al [17]: ' ' - -; Arabic; KNN…”
Section: Related Work (mentioning)
confidence: 99%
“…The Arabic language is a native language of the Arab states and the secondary language in a number of other countries [19]. More than 422 million people are able to speak Arabic, which makes this language the fifth most spoken language in the world, according to [14]. The alphabet of the Arabic language consists of 28 letters:…”
Section: Overview Of Arabic Language Structure (mentioning)
confidence: 99%
“…Several scholars have discussed the difficulties associated with developing natural language processing methods and algorithms for Arabic. These challenges include the ambiguity and complexity of Arabic (Kanan & Fox, 2016; Salloum, Al-emran, & Shaalan, 2016), the prevalence of several commonly used dialects in Arabic (Samih et al, 2017; Zalmout, Erdmann, & Habash, 2018), and the limited number of freely available datasets that can be used in the research and development for Arabic computational solutions (Zeroual & Lakhouaja, 2018). This study further investigates the complexity of Arabic and the problems associated with computational solutions that do not incorporate Arabic dialects.…”
Section: Arabic Natural Language Processing (mentioning)
confidence: 99%
“…It is relevant to note that in the domain of natural language processing, researchers often focus on developing solutions specific for a target language such as Arabic or Chinese for common tasks. These tasks include for example document classification and document summarization (Alanzi & Abuzeina, 2017; Al-Thubaity, Alhoshan, & Hazzaa, 2015; Kanan & Fox, 2016; Zhang, Xu, Su, & Xu, 2015). Therefore, focusing on novel methods to generate labels for images in Arabic is similarly important.…”
Section: Introduction (mentioning)
confidence: 99%