2018
DOI: 10.1007/978-981-10-8198-9_22

Analyzing and Preprocessing the Twitter Data for Opinion Mining

Cited by 6 publications
(3 citation statements)
references
References 8 publications
“…The common thing in many of the recent studies regarding sentiment analysis is the use of Twitter as the primary source of data [23] [24] [25]. It is owing to the fact that Twitter is a universal microblogging website and it allows the users to express their thoughts in limited characters which makes the preprocessing part easy for the researchers [26]. However, as discussed in Section I, machine learning-based sentiment classification for tweets encounters three problems namely, high sparsity, high-dimensional feature vectors and highly skewed classes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some manually annotated tweets are shown below in. Preprocessing and Feature Extraction: The data collected from social media contains a lot of noise (25)(26)(27)(28). The data needs to be filtered to remove the noise and prepare it for machine classification before the machine processes the data.…”
Section: Tweet Length and Distribution (mentioning)
confidence: 99%
“…Thus, out of 4797, a total of 531 tweets were removed, leaving 4266 tweets for classification. The text is pre-processed by using various promising standard techniques of text mining (27,31). After removing the URLs, non-ASCII characters, the text was tokenized and stemmed using Tokenizer and Stemmer available in Weka Tool.…”
Section: Tweet Length and Distribution (mentioning)
confidence: 99%
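
The preprocessing steps named in the last citation statement (removing URLs, removing non-ASCII characters, tokenizing, stemming) can be sketched roughly as follows. This is only an illustrative Python sketch, not the cited authors' pipeline: they used the Tokenizer and Stemmer in the Weka tool, whereas `naive_stem`, `preprocess`, the regular expressions, and the suffix list here are hypothetical stand-ins chosen for brevity.

```python
import re

# Assumed patterns: a simple URL matcher and a lowercase word tokenizer.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9']+")

# Hypothetical suffix list; a stand-in for a real stemmer such as Weka's.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Remove URLs and non-ASCII characters, then tokenize and stem."""
    text = URL_RE.sub(" ", tweet)
    # Drop any character that cannot be encoded as ASCII (emoji, symbols).
    text = text.encode("ascii", errors="ignore").decode("ascii")
    tokens = TOKEN_RE.findall(text.lower())
    return [naive_stem(t) for t in tokens]


print(preprocess("Testing links http://t.co/abc ☺ worked"))
# → ['test', 'link', 'work']
```

A real pipeline would likely swap `naive_stem` for an established stemmer, since suffix stripping this crude will over-stem many words.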