2021
DOI: 10.3390/info12120523

A Comparative Study of Arabic Part of Speech Taggers Using Literary Text Samples from Saudi Novels

Abstract: Part of Speech (POS) tagging is one of the most common techniques used in natural language processing (NLP) applications and corpus linguistics. Various POS tagging tools have been developed for Arabic. These taggers differ in several aspects, such as their modeling techniques, tag sets, and training and testing data. In this paper we conduct a comparative study of five Arabic POS taggers, namely: Stanford Arabic, CAMeL Tools, Farasa, MADAMIRA and Arabic Linguistic Pipeline (ALP) which examine their performa…

Cited by 7 publications (2 citation statements)
References 13 publications
“…The choice for ALP was made based on a comparative study of the five most common Arabic POS taggers, the Stanford Arabic tagger ( https://nlp.stanford.edu/software/tagger.shtml ), CAMeL Tools ( https://github.com/CAMeL-Lab/camel_tools ), MADAMIRA ( https://camel.abudhabi.nyu.edu/madamira ), Farasa ( https://farasa.qcri.org/POS ), and ALP ( http://arabicnlp.pro/ ). The study used text samples from Saudi novels and found that the ALP outperformed the other taggers ( Alluhaibi et al, 2021 ).…”
Section: Methods (mentioning)
confidence: 99%
“…In this step, the words of the corpus were annotated with POS tags using the free Arabic Linguistics Pipeline (ALP) tool [59] (The tool can be accessed via the following link: http://arabicnlp.pro/alp/index-ar.php, accessed on 12 April 2022). This tool was chosen objectively based on the results of a study by [60] that compared five well-known Arabic POS taggers (Arabic Stanford, CAMeL, Farasa, MADAMIRA, and ALP) using a sample from the current corpus. ALP performed better along all three considered metrics (Recall, Precision, and F-score).…”
Section: Word Annotation (mentioning)
confidence: 99%