2015
DOI: 10.14257/ijhit.2015.8.3.07

The KNN based Uyghur Text Classification and its Performance Analysis

Abstract: This paper addresses the automatic classification of large-scale Uyghur text collected from the web. We designed the functional block structure of a Uyghur text classification system, chose the KNN algorithm as the classification engine, and implemented the system in C#. In the preprocessing stage, drawing on the lexical characteristics of the Uyghur language, we introduced a stem extraction method into the procedure, which greatly reduced the whole f…

citations
Cited by 3 publications
(3 citation statements)
references
References 1 publication
“…Previous works [4] on stem extraction for Uyghur and Kazakh texts are mostly based on simple suffix-based stemming methods and some simple hand-crafted rules, which suffer from ambiguity, particularly on the short texts. Sentence or longer context-based reliable stem extraction methods can extract stems and terms accurately in noisy texts for Uyghur and Kazakh texts on a sentence level, and lead to the ambiguity reduction in a noisy text environment.…”
Section: Stem
Citation type: mentioning
confidence: 99%
“…Some works on Uyghur and Kazakh text classification have been reported in [4,9,10]. Tuerxun et al [4] used KNN (K-Nearest Neighbor) as a classifier on Uyghur text to conduct text classification, and used the TFIDF (Term Frequency-Inverse Document Frequency) algorithm to calculate the feature weight in this paper.…”
Section: Stem
Citation type: mentioning
confidence: 99%
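
The citation statement above describes a KNN classifier over TFIDF-weighted features. A minimal Python sketch of that general technique follows. It is not the cited authors' C# system: the toy English tokens stand in for already-extracted stems, the smoothed IDF formula and the use of cosine similarity are assumptions, and every function name here is hypothetical.

```python
import math
from collections import Counter


def weigh(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (dict: term -> weight)."""
    tf = Counter(tokens)
    # Terms never seen in training have no IDF and are ignored.
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def tfidf_vectors(docs):
    """Compute IDF over the training documents and weight each document."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency counts each term once per doc
    # Assumed variant: log(N / df) + 1, so terms present in every doc keep weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [weigh(tokens, idf) for tokens in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query_tokens, train_vectors, train_labels, idf, k=3):
    """Label a document by majority vote among its k most similar neighbours."""
    q = weigh(query_tokens, idf)
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus; real input would be stemmed Uyghur tokens.
train_docs = [
    ["goal", "team", "match"],
    ["team", "score", "match"],
    ["market", "price", "bank"],
    ["bank", "loan", "price"],
]
train_labels = ["sport", "sport", "finance", "finance"]
train_vectors, idf = tfidf_vectors(train_docs)

print(knn_classify(["team", "match", "price"], train_vectors, train_labels, idf, k=3))
```

Stemming matters in this setup because every distinct surface form becomes its own dimension; mapping inflected forms to one stem shrinks the vocabulary that the IDF table and the similarity computation have to cover.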