2021
DOI: 10.1155/2021/6645345
A New Big Data Feature Selection Approach for Text Classification

Abstract: Feature selection (FS) is a fundamental task for text classification problems. Text feature selection aims to represent documents using the most relevant features. This process can reduce the size of datasets and improve the performance of the machine learning algorithms. Many researchers have focused on elaborating efficient FS techniques. However, most of the proposed approaches are evaluated for small datasets and validated using single machines. As textual data dimensionality becomes higher, traditional FS…

Cited by 9 publications (5 citation statements)
References 35 publications
“…For labeled data, according to the traditional support vector machine (SVM) theory, the loss function is the hinge loss, formula (10), as shown in Figure 6(a).…”
Section: Application of NLP in Text Classification (mentioning)
confidence: 99%
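The statement above refers to the citing paper's formula (10), which is not reproduced in this excerpt. Assuming it is the standard SVM hinge loss, L(y, f(x)) = max(0, 1 − y·f(x)), a minimal sketch:

```python
def hinge_loss(y_true, score):
    """Standard SVM hinge loss for a single labeled example.

    y_true: label in {-1, +1}; score: raw decision value f(x).
    The loss is zero when the example is classified with margin >= 1,
    and grows linearly as the margin shrinks or the sign is wrong.
    """
    return max(0.0, 1.0 - y_true * score)

print(hinge_loss(+1, 2.0))   # 0.0  (correct with margin, no penalty)
print(hinge_loss(+1, 0.5))   # 0.5  (correct but inside the margin)
print(hinge_loss(-1, 1.0))   # 2.0  (misclassified, penalized linearly)
```

This form is hypothetical here since the quoted formula is elided, but it matches the traditional SVM theory the statement invokes.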
“…The training samples of unbalanced human resource data are smoothly updated under the state of genetic iteration, and the central point of the transmitted unbalanced data cluster is updated according to the principle of the small square function value. The obtained power spectrum density function of the transmitted unbalanced data is taken as the feature to optimize the features of the transmitted unbalanced data [15]. The process is as follows.…”
Section: Parallel Recognition of Unbalanced Human Resource Data (mentioning)
confidence: 99%
“…For an FS procedure, the entire search space encompasses all potential subsections of features, and the degree of subsections is computed using the following equation: where n denotes the number of characteristics in the existing subset of features and z is the size of the entire feature subset. FS approaches typically necessitate heuristics or randomized search tactics, which add to the complexity of the resulting group, lowering its degree of optimization problem [ 25 ]. Depending on their evaluation approach, FS techniques can be divided into three categories.…”
Section: Methods (mentioning)
confidence: 99%
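The equation the statement refers to is not reproduced in this excerpt. Given the stated variables (n features in the current subset, z features in total), a natural reading is the binomial coefficient C(z, n) counting the size-n subsets; a hedged sketch under that assumption:

```python
from math import comb

def num_feature_subsets(z: int, n: int) -> int:
    """Number of distinct feature subsets of size n drawn from z features,
    i.e. the binomial coefficient C(z, n) = z! / (n! * (z - n)!).

    This is an assumed reconstruction of the elided equation, not the
    citing paper's verbatim formula.
    """
    return comb(z, n)

print(num_feature_subsets(5, 2))   # 10
# Summing over every subset size recovers the full 2**z search space,
# which is why FS methods resort to heuristic or randomized search.
assert sum(num_feature_subsets(5, k) for k in range(6)) == 2 ** 5
```

The exponential growth of the total search space (2^z subsets) is what motivates the heuristic search tactics and the filter/wrapper/embedded categorization the statement goes on to describe.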
“…where n denotes the number of characteristics in the existing subset of features and z is the size of the entire feature subset. FS approaches typically necessitate heuristics or randomized search tactics, which add to the complexity of the resulting group, lowering its degree of optimization problem [25].…”
Section: Phase of Feature Selection (mentioning)
confidence: 99%