2014
DOI: 10.1016/j.ipm.2013.08.006

The impact of preprocessing on text classification

Cited by 525 publications (288 citation statements)
References 25 publications
“…The stop words were removed, as they do not convey any meaningful information. Finally, the textual content of the tweets was converted to lowercase characters, as Uysal and Gunal (2014) showed that lowercase conversion is an effective pre-processing step. As a last step, the text in Hindi was translated to English.…”
Section: Data Pre-processing | mentioning
confidence: 99%
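
The two steps named in the statement above, lowercase conversion and stop-word removal, can be sketched roughly as follows. This is only an illustration, not the procedure used by the citing authors or by Uysal and Gunal (2014): the `preprocess` function, the regular-expression tokenizer, and the tiny `STOP_WORDS` list are all assumptions made for the example. A real system would use a fuller, language-specific stop-word list.

```python
import re

# Hypothetical, deliberately tiny English stop-word list (illustration only).
STOP_WORDS = frozenset(
    {"the", "a", "an", "is", "are", "of", "on", "to", "and", "in", "as"}
)


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Lowercase first so that "The" and "the" map to the same token.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Keep only the tokens that are not in the stop-word list.
    return [token for token in tokens if token not in stop_words]


print(preprocess("The Impact of Preprocessing on Text Classification"))
# ['impact', 'preprocessing', 'text', 'classification']
```

Lowercasing before filtering matters here: the stop-word list is all lowercase, so a capitalized "The" would otherwise slip through.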
“…For this reason, they are, most of the time, assumed to be uninformative. However, there exists several efforts, which reveals this assumption is not always true [15]. As one can easily realize, stop-words are specific to the language.…”
Section: Preprocessing Methods | mentioning
confidence: 99%
“…Refs. [42,43] suggest that feature selection is a very important stage in addition to feature extraction and classification. The selected data are moved to the preprocessing module in order to transform data to suit the learning algorithms, ultimately resulting in quality output.…”
Section: Preprocessing | mentioning
confidence: 99%
“…After this process, any classifier can implement the text classification process by predicting the label of the document. The research community working in this field is still studying how to improve the performance of text classification by combining various preprocessing [43,46], feature extraction [47], feature selection [42,48], and ensemble methods [49]. The following features are extracted for the proposed model:…”
Section: Feature Extraction and Selection | mentioning
confidence: 99%
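
The last two statements describe a chain of stages: preprocessing, then feature extraction, then feature selection, then classification. As a rough sketch of the two middle stages only, the example below uses bag-of-words counts for extraction and document frequency for selection. Both techniques are assumptions chosen for brevity, not the methods of the cited works, and the function names and toy documents are hypothetical.

```python
from collections import Counter


def extract_features(tokens):
    """Bag-of-words feature extraction: map each token to its count."""
    return Counter(tokens)


def select_features(docs, k):
    """Keep the k terms that occur in the most documents."""
    doc_freq = Counter()
    for tokens in docs:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    # Highest document frequency first; ties broken alphabetically.
    ranked = sorted(doc_freq, key=lambda term: (-doc_freq[term], term))
    return ranked[:k]


def vectorize(tokens, vocabulary):
    """Turn one tokenized document into a count vector over the vocabulary."""
    counts = extract_features(tokens)
    return [counts[term] for term in vocabulary]


# Toy, already-preprocessed documents (hypothetical data).
docs = [
    ["text", "classification", "preprocessing"],
    ["text", "feature", "selection"],
    ["text", "classification", "ensemble"],
]

vocabulary = select_features(docs, k=2)
print(vocabulary)                      # ['text', 'classification']
print(vectorize(docs[1], vocabulary))  # [1, 0]
```

The resulting vectors are what a classifier would consume in the final stage. Any classifier could be substituted there, as the quoted statement notes.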