2013
DOI: 10.6109/jicce.2013.11.4.268

Document Classification Model Using Web Documents for Balancing Training Corpus Size per Category

Abstract: In this paper, we propose a document classification model that uses Web documents as part of the training corpus in order to resolve the imbalance in training corpus size across categories. To retrieve the Web documents closely related to each category, the proposed model calculates the matching score between word features and each category, and generates a Web search query by combining the higher-ranked word features with the category title. Then, the proposed document cl…
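
The query-generation step described in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not define the matching score, so a TF-IDF-style score computed over categories is used here as a stand-in, and the names `matching_scores`, `build_query`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def matching_scores(docs_by_category):
    """Return {category: {word: score}} for every word seen in each category.

    Assumption: the paper's actual matching score is not given in the abstract.
    This stand-in (term frequency times log inverse category frequency) rewards
    words that are frequent in one category and rare in the others.
    """
    term_counts = {
        category: Counter(word for doc in docs for word in doc.lower().split())
        for category, docs in docs_by_category.items()
    }
    n_categories = len(term_counts)
    # Number of categories in which each word occurs at least once.
    category_freq = Counter(
        word for counts in term_counts.values() for word in counts
    )
    return {
        category: {
            word: tf * math.log(n_categories / category_freq[word])
            for word, tf in counts.items()
        }
        for category, counts in term_counts.items()
    }


def build_query(category_title, scores, top_k=2):
    """Combine the category title with its top_k highest-scoring words."""
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(
        scores[category_title].items(), key=lambda kv: (-kv[1], kv[0])
    )
    top_words = [word for word, _ in ranked[:top_k]]
    return " ".join([category_title] + top_words)


if __name__ == "__main__":
    # Toy corpus, invented purely for illustration.
    corpus = {
        "sports": ["the team won the match", "the coach praised the team"],
        "finance": ["the bank raised the rate", "the market fell"],
    }
    scores = matching_scores(corpus)
    print(build_query("sports", scores))   # sports team coach
    print(build_query("finance", scores))  # finance bank fell
```

In this sketch, a word that occurs in every category (such as "the") scores zero and never enters a query. The resulting query string would then be sent to a Web search engine to collect additional training documents for under-represented categories, a step that is left out here.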

Cited by 15 publications (5 citation statements)
References 16 publications
“…In general, conventional methods such as telephone calls, mails, and online/offline surveys have been used. Social media including bulletin boards on the internet and SNS has recently become an important source of collecting VOC data [8].…”
Section: Discussion (mentioning)
confidence: 99%
“…In addition, it was confirmed that the step of Parse, Compile causes a bottleneck of a large-scale application or framework. For mobile, parse, Compile time takes about 2 to 5 times more than desktop environment [6].…”
Section: Javascript (mentioning)
confidence: 99%
“…Support Vector Machine (SVM) is one of the statistical learning theories and have been recognized as one of the effective classification methods as compared to supervised machine learning algorithms [18]. Also, it can be introduced into solving pattern recognition problems with small samples and learning problems such as function estimation [19].…”
Section: Machine Learning (mentioning)
confidence: 99%