2013
DOI: 10.1177/0165551513507415

Utilizing global and path information with language modelling for hierarchical text classification

Abstract: Hierarchical text classification of a Web taxonomy is challenging because it is a very large-scale problem, with hundreds of thousands of categories and associated documents. Furthermore, the conceptual levels and training data availability of categories vary widely. The narrow-down approach, the current state of the art, uses a search engine to generate candidate categories from the taxonomy and builds a classifier for the final category selection. In this paper, we take the same approach but address the issue of usi…

Cited by 5 publications (11 citation statements)
References 28 publications

“…Our key contribution is the derivation of new approaches for the category selection stage. For the robust understanding of our novel methods, we first give a quick summary of language models and introduce an existing category selection approach that utilizes global information aggressively [16] because the aggressive method outperforms the passive one. We then describe our proposed two category selection methods in Sections 3.3 and 3.4, respectively.…”
Section: Proposed Methods (mentioning)
confidence: 99%
“…The difference is that the former is to update centroids immediately after classification while the latter is after classifying all the training documents. We borrow the online update method used in Oh and Myaeng [16] and the best performing parameter setting [18] is chosen.…”
Section: Proposed Methods (mentioning)
confidence: 99%