2014
DOI: 10.7763/ijmlc.2014.v4.438

A Feature Selection Method for Twitter News Classification

Abstract: This paper presents a new feature selection method that can be used for creating a data set to classify Twitter short messages. Twitter messages contain at most 140 characters, so the number of words per sentence is roughly equal across sentences. Once all the text messages are pooled together, the pool can contain a large number of words, but any given sentence includes only a few of them. This results in a sparse matrix as the feature vector. By removing th…
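To make the sparsity argument concrete, the following is a minimal, hypothetical sketch (not the paper's code) that builds a bag-of-words matrix for a few made-up tweets with scikit-learn's CountVectorizer and reports how sparse it is:

```python
# Hypothetical illustration: each short message uses only a few words from the
# pooled vocabulary, so the document-term matrix is sparse.
from sklearn.feature_extraction.text import CountVectorizer

tweets = [  # made-up examples, not data from the paper
    "election results announced today",
    "team wins championship final",
    "new phone released this week",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(tweets)   # sparse document-term matrix

n_docs, n_terms = X.shape
density = X.nnz / (n_docs * n_terms)   # fraction of non-zero entries
print(f"{n_docs} tweets x {n_terms} pooled words, density = {density:.2f}")
```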

Cited by 14 publications (3 citation statements)
References 6 publications
“…The chi-square method used for feature selection helps to reduce dimensionality as well as noise in the data and increases the accuracy of the classifier from 81.5 to 92.3 with TF-IDF, from 83 to 90 with FF, and from 83.1 to 93 with FP [21]. The number of features selected strongly affects the accuracy of the result, but beyond a certain limit, increasing the size of the feature set does not increase classification accuracy [22]-[25].…”
Section: E. Feature Selection Methods and Number of Features Selected
confidence: 99%
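As a rough sketch of the chi-square selection step described in [21] (not the cited authors' exact setup; the corpus, labels, and k below are placeholders), scikit-learn's SelectKBest with the chi2 score can be applied to TF-IDF features like this:

```python
# Sketch: chi-square feature selection over TF-IDF features.
# Corpus, labels, and k are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2

docs = [
    "stocks fall after earnings report", "markets rally on trade deal",
    "team wins the cup final", "star player injured before match",
    "budget vote delayed again", "parliament debates new tax bill",
]
labels = ["business", "business", "sports", "sports", "politics", "politics"]

X = TfidfVectorizer().fit_transform(docs)      # non-negative TF-IDF weights
selector = SelectKBest(chi2, k=8)              # keep the 8 highest-scoring terms
X_reduced = selector.fit_transform(X, labels)  # smaller, less noisy feature matrix

print(X.shape, "->", X_reduced.shape)
```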
“…Patil et al. [11] implemented SVM with and without feature extraction and showed that SVM eliminated the need for feature selection owing to its ability to generalize over a high-dimensional feature space. Inoshika et al. [12] studied feature ranking and selection techniques for opinion mining on Twitter data and suggested removing unrelated words from the feature space to reduce dimensionality, which further reduces the sparseness of the feature set. That work also proposed a new feature selection technique based on information theory, named the Ratio Method.…”
Section: Research Strategy Design
confidence: 99%
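A hedged sketch of the kind of comparison attributed to Patil et al. [11]: a linear SVM trained once on the full TF-IDF space and once after a chi-square selection step. The pipeline layout, k, and the variable names in the usage comment are assumptions, not the original experiment:

```python
# Sketch: linear SVM trained with and without a feature-selection step.
# The selection method (chi-square) and k are assumed placeholders.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC

def build_pipeline(select_features: bool) -> Pipeline:
    steps = [("tfidf", TfidfVectorizer())]
    if select_features:
        steps.append(("chi2", SelectKBest(chi2, k=1000)))  # assumed feature budget
    steps.append(("svm", LinearSVC()))
    return Pipeline(steps)

# Usage, assuming train/test texts and labels are already available:
# full = build_pipeline(select_features=False).fit(train_texts, train_labels)
# reduced = build_pipeline(select_features=True).fit(train_texts, train_labels)
# print(full.score(test_texts, test_labels), reduced.score(test_texts, test_labels))
```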
“…For example, text classification models based on the support vector machine (SVM) classify news in each Twitter group into positive and negative categories [18,19]. Furthermore, additional feature extraction methods have been introduced to construct text classification models based on SVM and its variants, which are applied to multi-topic news classification [20,21]. In recent years, the rapid development of deep neural networks has driven the study of text classification models based on deep learning, which has become the mainstream in the field of text classification.…”
Section: Introduction
confidence: 99%