2017
DOI: 10.1142/s021848851750009x

The Hybrid Filter Feature Selection Methods for Improving High-Dimensional Text Categorization

Abstract: The bag-of-words technique is often used to represent a document in text categorization. However, for a large set of documents where the dimension of the bag-of-words vector is very high, text categorization becomes a serious challenge as a result of sparse data, over-fitting, and irrelevant features. A filter feature selection method reduces the number of features by eliminating irrelevant features from the bag-of-words vector. In this paper, we analyze the weak points and strong points of two filter feature se…
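
The idea of a filter method, scoring each bag-of-words feature independently of any classifier and keeping only the highest-scoring ones, can be illustrated with a minimal sketch. The abstract is truncated before it names the two filter methods the paper analyzes, so the chi-square score used here is an assumption chosen as a common example, not the paper's hybrid method. The function names (`bag_of_words`, `chi2_score`, `select_top_k`) and the four toy documents are hypothetical.

```python
def bag_of_words(docs):
    """Return (sorted vocabulary, per-document term sets) using whitespace tokens."""
    token_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*token_sets))
    return vocab, token_sets


def chi2_score(term, token_sets, labels, positive):
    """Chi-square statistic of the 2x2 table: term present/absent vs. class positive/other."""
    n = len(labels)
    pairs = list(zip(token_sets, labels))
    a = sum(1 for s, y in pairs if term in s and y == positive)      # present, positive
    b = sum(1 for s, y in pairs if term in s and y != positive)      # present, other
    c = sum(1 for s, y in pairs if term not in s and y == positive)  # absent, positive
    d = n - a - b - c                                                # absent, other
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_k(docs, labels, positive, k):
    """Keep the k terms with the highest score; ties are broken alphabetically."""
    vocab, token_sets = bag_of_words(docs)
    scored = [(chi2_score(t, token_sets, labels, positive), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy corpus: two "spam" and two "ham" documents.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "meeting agenda for monday",
    "monday project meeting notes",
]
labels = ["spam", "spam", "ham", "ham"]

selected = select_top_k(docs, labels, positive="spam", k=5)
print(selected)  # ['buy', 'cheap', 'meeting', 'monday', 'now']
```

Terms that occur in only one document of a class, such as "pills" or "agenda", score lower than terms that separate the two classes perfectly, so they are the first to be filtered out. On a realistic corpus the vocabulary would be far larger and the cutoff `k` would need tuning.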

Cited by 12 publications
(3 citation statements)
References 19 publications
“…The statistical comparison includes parametric and non-parametric. In this study, we used both t-test, a parametric statistical comparison, and Wilcoxon signed-ranks test, a nonparametric statistical comparison, as in [43], [54], and [55]. T-test depends on t-value while Wilcoxon signed-ranks test depends on z-value.…”
Section: Tools-home Improvement Dataset (mentioning)
confidence: 99%
“…The statistical comparison includes parametric and non‐parametric methods. In this study, we used both the t‐test, a parametric statistical comparison, and the Mann–Whitney U test, a non‐parametric statistical comparison 54–56. These pairwise statistical comparisons include the averages ( μnormals) of the evaluation results (SSE, MAE, MBRE, MIBRE, MdMRE, and RMSE) from the five‐fold cross‐validations of the four experimental datasets.…”
Section: Introduction (mentioning)
confidence: 99%
“…The news popularity prediction studied in this paper is to predict the number of pageviews or retweets that may be obtained after the news is released. It can help journalists to better evaluate the quality of news and rank news, so as to conduct news delivery more reasonably [17][18][19][20]. Online news popularity prediction is an extremely challenging task.…”
Section: Introduction (mentioning)
confidence: 99%