2013
DOI: 10.1109/tkde.2012.27

Dirichlet Process Mixture Model for Document Clustering with Feature Partition

citations
Cited by 53 publications
(28 citation statements)
references
References 24 publications
“…In this part, we compare the performance of GSDMM with K-means [13], HAC [15], and DMAFP [11]. K-means and HAC are two popular similarity-based clustering models.…”
Section: Comparison Of Clustering Models (mentioning)
confidence: 99%
“…In the experimental study, we compared GSDMM with Kmeans [13], the Hierarchical Agglomerative Clustering (HAC) model [15], and DMAFP [11]. We did not choose other clustering methods like Gaussian Mixture Model [5], Affinity Propagation [8], and Spectral clustering [18], because they are not scalable with high-dimensional and large-volume data like texts.…”
Section: Introduction (mentioning)
confidence: 99%
“…Yu et al [40] and Huang et al [13] propose a Dirichlet process mixture with feature selection model (DPMFS) and a Dirichlet process mixture with feature partition model (DPMFP) for normal document clustering, respectively. They compare DPMFP with four other clustering models: EM text classification (EM-TC) [25], K-means [16], LDA [3] and exponential-family approximation of the Dirichlet compound multinomial distribution (EDCM) [8]; they find that DPMFP performs best.…”
Section: User Clustering and Text Clustering (mentioning)
confidence: 99%
“…However, such data encryptions render effective data utilization a very challenging task due to the basic reason that there could be a large amount of outsourced data files. Not only that, many a times, in Cloud Computing, data owners may share their complete outsourced data with a large number of users, who actually might want to retrieve only certain specific data files that they are interested in during a given session (Ming et al, 2011;Murugesan, 2011;Huang et al, 2013;Song et al, 2000;Wang et al, 2010;2012;Witten et al, 1999;Yu et al, 2013).…”
Section: Ajas (mentioning)
confidence: 99%