2019
DOI: 10.1109/access.2019.2908420

Discriminative Feature Spamming Technique for Roman Urdu Sentiment Analysis

Abstract: Term weighting is one of the most commonly used approaches, which works by assigning weights to terms, that aims to improve the performance of information retrieval or text categorization tasks. In this paper, we present a novel term weighting technique, called discriminative feature spamming technique (DFST), which identifies distinctive terms, based on a term utility criteria (TUC), and then spams them to increase their discriminative power. The experimental results show that the DFST outperformed a set of t…
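
The abstract's two-step idea (pick distinctive terms, then "spam" them) can be illustrated with a minimal sketch. This is not the authors' DFST: the abstract does not define the term utility criteria, so `term_utility` below is a hypothetical stand-in (the gap in per-class document frequency), and the threshold, the repetition `factor`, and the toy Roman Urdu reviews are all made up for illustration.

```python
from collections import Counter

def term_utility(term, pos_docs, neg_docs):
    # Hypothetical criterion (an assumption, not the paper's TUC):
    # absolute gap between the term's document frequency in each class.
    pos_df = sum(term in doc for doc in pos_docs) / len(pos_docs)
    neg_df = sum(term in doc for doc in neg_docs) / len(neg_docs)
    return abs(pos_df - neg_df)

def spam_features(tokens, distinctive, factor=3):
    # Repeat each distinctive token `factor` times so that any
    # frequency-based weighting applied afterwards gives it more weight.
    out = []
    for tok in tokens:
        out.extend([tok] * (factor if tok in distinctive else 1))
    return out

# Toy tokenized reviews, invented for this example.
pos_docs = [["acha", "phone", "hai"], ["bohat", "acha", "camera"]]
neg_docs = [["bura", "phone", "hai"], ["bohat", "bura", "battery"]]

vocab = {t for doc in pos_docs + neg_docs for t in doc}
distinctive = {t for t in vocab if term_utility(t, pos_docs, neg_docs) >= 0.9}

spammed = spam_features(["phone", "acha", "hai"], distinctive)
counts = Counter(spammed)
```

With this toy data only the terms that occur in every review of one class and in none of the other pass the threshold, so their counts grow while neutral terms such as "phone" are left unchanged.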

Citations: cited by 34 publications (18 citation statements)
References: 43 publications
“…This section of the study outlines the studies that discuss the different shortcomings and opportunities for the area of Urdu-based SA. The existing literature regarding Urdu-based sentiment analysis, when summarized, reflects the various challenges such as non-availability of open access Urdu corpora (Raza et al, 2019), limitation of existing language constructs for the Urdu language like (slangs and emotions-based sentences) (Marrese-Taylor, Velásquez & Bravo-Marquez, 2014), a limited collection of negation and revised modifiers (Mehmood et al, 2019a). The need for more domain-centric words for Urdu language (Khattak et al, 2021; Hasan et al, 2018).…”
Section: Assessment and Discussion Of Research Questions (mentioning)
confidence: 99%
“…Manual Annotation, Auto-annotation and Semi Auto-annotation. In this study, manual annotation was performed to label the reviews into their targeted class according to the guidelines presented in [44] and [45]. Each review was labelled with one of the two classes-Positive or negative.…”
Section: Corpus Generation (mentioning)
confidence: 99%
“…While bag-of-words based feature representation approaches face the problem of data sparsity, randomly initialized word embeddings do not capture the semantics of text and fail to overshadow the performance of pre-trained neural word embeddings. Moreover, most of the existing work [19], [54], [20], [61], [55], [59], [56], [57] employs machine learning based methodologies, only Ghulam et al [62] have utilized a deep learning based approach for the task of Roman Urdu sentiment analysis. Also, no public benchmark dataset is available for Roman Urdu sentiment analysis…”
Section: Contributions: a Review Of Novel Work With Anticipated I… (mentioning)
confidence: 99%
“…They reported that WordNet and TextBlob were highly accurate in word sense disambiguation and largely assisted the classifier to detect polarity in political reviews. Mehmood et al [51] presented a novel feature representation approach namely "Discriminative Feature Spamming" for Roman Urdu sentiment analysis. They compared the performance of the presented approach with TF, Binary Weighting, TF-IDF with word and character level features using Naive Bayes, Logistic Regression, majority voting, weighted voting, and multi-layer perceptron.…”
Section: Roman Urdu Sentiment Analysis (mentioning)
confidence: 99%
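
The baseline weightings named in the last statement (TF, binary weighting, TF-IDF, over word-level or character-level features) can be sketched as follows. This is a textbook-style illustration, not the code used in either paper: the function names are hypothetical, the TF-IDF variant shown is the plain `tf * log(N / df)` form, and the trigram size and toy corpus are assumptions.

```python
import math
from collections import Counter

def tf_weights(tokens):
    # Term frequency: raw count of each term in the document.
    return dict(Counter(tokens))

def binary_weights(tokens):
    # Binary weighting: 1 if the term occurs at all, regardless of count.
    return {t: 1 for t in set(tokens)}

def tfidf_weights(tokens, corpus):
    # Plain TF-IDF: tf * log(N / df); terms unseen in the corpus get 0.0.
    n_docs = len(corpus)
    weights = {}
    for term, tf in Counter(tokens).items():
        df = sum(term in doc for doc in corpus)
        weights[term] = tf * math.log(n_docs / df) if df else 0.0
    return weights

def char_ngrams(text, n=3):
    # Character-level features: overlapping n-grams of the raw string.
    return [text[i:i + n] for i in range(len(text) - n + 1)]

# Toy corpus and document, invented for this example.
corpus = [["acha", "phone"], ["bura", "phone"]]
doc = ["acha", "acha", "phone"]

tf = tf_weights(doc)
binary = binary_weights(doc)
tfidf = tfidf_weights(doc, corpus)
trigrams = char_ngrams("acha")
```

In this toy case "phone" appears in every corpus document, so its TF-IDF weight is zero, while "acha" keeps a positive weight; the same functions accept character n-grams in place of word tokens.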