2012
DOI: 10.1186/1687-6180-2012-233

LDA boost classification: boosting by topics

Abstract: AdaBoost is an efficacious classification algorithm, especially in text categorization (TC) tasks. Its methodology of setting up a classifier committee and voting on the documents to be classified can achieve high categorization precision. However, the traditional Vector Space Model easily leads to the curse of dimensionality and to feature sparsity, which seriously degrades classification performance. This article proposes a novel classification algorithm called LDABoost, based on the boosting ideology, which …

Cited by 10 publications (5 citation statements)
References 15 publications
“…Mallet is a library of Java code for machine learning applied to text, developed by Andrew McCallum. The number of topics for each dataset is obtained according to equation (16). For LDA estimation on the training set, the inputs are (number of iterations = 2000, α = 50/K, β = 0.1), and for LDA prediction, the number of resampling iterations given is 1000.…”
Section: Experiments Settings
confidence: 99%
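The Mallet settings quoted above (α = 50/K, β = 0.1, 2000 estimation iterations, 1000 sampling iterations at prediction time) map fairly directly onto Mallet's ParallelTopicModel API. Below is a minimal sketch, assuming the documents have already been imported into Mallet InstanceList files; the file names and the choice of K are placeholders, not values taken from the cited experiments.

```java
import cc.mallet.topics.ParallelTopicModel;
import cc.mallet.topics.TopicInferencer;
import cc.mallet.types.Instance;
import cc.mallet.types.InstanceList;

public class LdaFeatureExtraction {
    public static void main(String[] args) throws Exception {
        // Hypothetical, pre-built Mallet instance lists (token sequences); not from the cited paper.
        InstanceList training = InstanceList.load(new java.io.File("train.mallet"));
        InstanceList test = InstanceList.load(new java.io.File("test.mallet"));

        int numTopics = 50;       // K would come from the paper's equation (16); 50 is a placeholder
        double alphaSum = 50.0;   // per-topic alpha = 50/K, so the summed alpha is 50
        double beta = 0.1;

        ParallelTopicModel lda = new ParallelTopicModel(numTopics, alphaSum, beta);
        lda.addInstances(training);
        lda.setNumIterations(2000);   // estimation iterations quoted above
        lda.estimate();

        // At prediction time, sample a topic distribution for each unseen document.
        TopicInferencer inferencer = lda.getInferencer();
        for (Instance doc : test) {
            double[] topicDist = inferencer.getSampledDistribution(doc, 1000, 10, 100);
            // topicDist is the K-dimensional feature vector a topic-based classifier would consume
        }
    }
}
```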
“…A related study was conducted by Lei et al [16], using LDA as a feature representation method for TC based on the binary version of the AdaBoost algorithm, with Naive Bayes as the weak learner for AdaBoost. However, Naive Bayes works on feature frequencies, whereas representing documents as latent topics means that each document is represented by a small number of weighted and unique topics.…”
Section: Related Work
confidence: 99%
“…The results show that this improved random forest outperformed popular text classification methods such as Naive Bayes, SVM, KNN, and RF in terms of classification performance, giving an F-score of up to 91%. Lei, Qiao, Qimin & Qitao (2012) performed topic-based text categorization using the LDABoost ensemble learning method. The experimental results showed that LDABoost increased performance from 73.3% to 90%.…”
Section: Approaches Using Ensemble Learning Methods
confidence: 99%
“…A second classifier is then created to focus on the instances in the training data that the first classifier got wrong. The process continues to add classifiers until a limit on the number of models or on accuracy is reached (Lei, Qiao, Qimin & Qitao, 2012).…”
Section: Boosting
confidence: 99%
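For readers unfamiliar with the procedure this quote summarizes, the following is a minimal generic AdaBoost sketch: each new weak classifier is fit to instance weights that emphasize what the previous classifiers got wrong, and the resulting committee predicts by weighted vote. It uses one-dimensional decision stumps on a toy dataset as the weak learner and is illustrative only, not the LDABoost implementation from the cited paper.

```java
import java.util.Arrays;

public class AdaBoostSketch {

    /** Decision stump on a single feature: predicts +1 above the threshold, -1 below (or flipped). */
    static class Stump {
        int feature; double threshold; int polarity; double alpha;
        int predict(double[] x) { return polarity * (x[feature] > threshold ? 1 : -1); }
    }

    /** Fit the stump with the lowest weighted error on the current instance weights. */
    static Stump trainStump(double[][] X, int[] y, double[] w) {
        Stump best = new Stump(); double bestErr = Double.MAX_VALUE;
        for (int f = 0; f < X[0].length; f++) {
            for (double[] xi : X) {
                double t = xi[f];
                for (int pol : new int[]{1, -1}) {
                    double err = 0;
                    for (int i = 0; i < X.length; i++) {
                        int pred = pol * (X[i][f] > t ? 1 : -1);
                        if (pred != y[i]) err += w[i];
                    }
                    if (err < bestErr) {
                        bestErr = err;
                        best.feature = f; best.threshold = t; best.polarity = pol;
                    }
                }
            }
        }
        best.alpha = 0.5 * Math.log((1 - bestErr) / Math.max(bestErr, 1e-10));
        return best;
    }

    public static void main(String[] args) {
        // Tiny toy dataset (two features, labels in {-1, +1}); purely illustrative.
        double[][] X = {{1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6}, {6, 5}};
        int[] y = {-1, -1, -1, 1, 1, 1};

        int rounds = 5;                                  // the "limit in the number of models"
        double[] w = new double[X.length];
        Arrays.fill(w, 1.0 / X.length);                  // start with uniform instance weights
        Stump[] committee = new Stump[rounds];

        for (int t = 0; t < rounds; t++) {
            Stump h = trainStump(X, y, w);               // weak learner fit on current weights
            committee[t] = h;
            double sum = 0;
            for (int i = 0; i < X.length; i++) {
                // Misclassified instances get heavier, so the next classifier focuses on them.
                w[i] *= Math.exp(-h.alpha * y[i] * h.predict(X[i]));
                sum += w[i];
            }
            for (int i = 0; i < X.length; i++) w[i] /= sum;
        }

        // Weighted majority vote of the committee on one query instance.
        double[] query = {2.5, 2.5};
        double score = 0;
        for (Stump h : committee) score += h.alpha * h.predict(query);
        System.out.println("Committee vote: " + (score >= 0 ? "+1" : "-1"));
    }
}
```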
“…The proposed method was based on boosting algorithms for multilabel multiclass text categorization, outperforming text classifiers based on TF-IDF [12] and naive Bayes. The use of LDA-based features in boosting algorithms was introduced by La et al [30]. The method, named LDABoost, uses latent topics extracted from one LDA model as text features.…”
Section: Related Work
confidence: 99%