2017
DOI: 10.1016/j.eswa.2016.12.011

Active function Cross-Entropy Clustering

Abstract: Gaussian Mixture Models (GMM) have found many applications in density estimation and data clustering. However, the model does not adapt well to curved and strongly nonlinear data. Recently there appeared an improvement called AcaGMM (Active curve axis Gaussian Mixture Model), which fits Gaussians along curves using an EM-like (Expectation Maximization) approach. Using the ideas standing behind AcaGMM, we build an alternative active function model of clustering, which has some advantages over AcaGMM. In particul…

Cited by 13 publications (12 citation statements)
References 47 publications
“…Evaluation of possible dissimilarity metrics for categorical data can be found in dos Santos and Zárate (2015), Bai et al (2011). To obtain a more flexible structure of clusters, one can also use hierarchical methods (Zhao and Karypis 2002), density-based clustering (Wen et al 2002) or model-based techniques (Spurek 2017;Spurek et al 2017). One of important publicly available tools for efficient clustering of high dimensional binary data is the Cluto package (Karypis 2002).…”
Section: Distance-based Clustering (citation type: mentioning)
confidence: 99%
“…AcaGMM works well in practice; however, it has major limitations. First of all, the AcaGMM cost function does not necessarily decrease with iterations, which causes problems with the stop condition, see [39]. Since the method uses orthogonal projections and arc lengths, it is very hard to use AcaGMM for more complicated curves in higher-dimensional spaces.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…In [39], authors have constructed the afCEC (active function cross-entropy clustering) algorithm, which allows the clustering of data on sub-manifolds of ℝ^d. The motivation comes from the observation that it is often profitable to describe nonlinear data by a smaller number of components with more complicated curved shapes to obtain a better fit of the data, see Fig.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…In this paper, we introduce a semi-supervised clustering method, CEC-IB, based on partition-level side information. CEC-IB combines Cross-Entropy Clustering (CEC) [39,40,42], a model-based clustering technique, with the Information Bottleneck (IB) method [11,43] to build the smallest model that preserves the side information and provides a good model of the data distribution. In other words, CEC-IB automatically determines the required number of clusters to trade between model complexity, model accuracy, and consistency with the side information.…”
Section: Introduction (citation type: mentioning)
confidence: 99%