2019
DOI: 10.1016/j.patcog.2019.01.042

Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering

Abstract: The conventional k-modes algorithm and its variants have been extensively used for categorical data clustering. However, these algorithms have some drawbacks, e.g. they can become trapped in local optima and are sensitive to the initial clusters/modes. Our numerical experiments even showed that the k-modes algorithm cannot identify the optimal clustering results for some special datasets, regardless of the selection of the initial centers. In this paper, for small-sized datasets we developed an optimal programming approach …
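To make the abstract's point about initial modes concrete, here is a minimal sketch of the conventional k-modes loop in Python. It is not the paper's optimal programming or variable neighborhood search method. It assumes the simple matching dissimilarity (the number of attributes on which two records differ) and breaks ties by lowest cluster index. The names `matching_distance` and `k_modes` and the toy records are hypothetical, chosen only for illustration.

```python
from collections import Counter


def matching_distance(a, b):
    """Simple matching dissimilarity: count of attributes where a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(data, init_modes, max_iter=100):
    """Plain k-modes: alternate assignment and mode update until labels stop changing.

    Returns (labels, modes, cost), where cost is the total matching
    dissimilarity between each record and the mode of its cluster.
    """
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest mode
        # (ties are resolved in favour of the lowest cluster index).
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_distance(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:
            break  # converged: no record changed cluster
        labels = new_labels
        # Update step: the new mode takes the most frequent value per attribute.
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0] for column in zip(*members)
                )
    cost = sum(matching_distance(row, modes[lab]) for row, lab in zip(data, labels))
    return labels, modes, cost


# Hypothetical toy dataset: six records with three categorical attributes.
data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
labels, modes, cost = k_modes(data, init_modes=[data[0], data[3]])
```

Calling `k_modes` with several different `init_modes` and comparing the returned `cost` values is one simple way to observe how much the result depends on initialization for a given dataset.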

Cited by 20 publications (13 citation statements)
References 43 publications
“…The techniques of grouping multidimensional space observations are algorithms of cluster analysis, hierarchical and non‐hierarchical ones [15]. Some are used for clustering of categorical datasets, such as indicated and referenced in the works by Xiao et al [32], Kudová et al [14], or Delibašic [4]. Selected clustering approaches either in their original forms or in modified forms have been studied and practiced.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…For large-sized C-MCLP instances, an efficient heuristic approach was developed to obtain near-optimal solutions and achieve high computational efficiencies. The proposed approach is referred to as the MILP-based dynamic iterative partial optimization (MILP-DIPO for short), and it is based on the MILP-based neighborhood searching algorithms developed by Xiao et al (2016, 2019a, 2019b). The underlying principle is as follows.…”
Section: A MILP-based Fix-and-optimize Heuristic Approach (citation type: mentioning)
confidence: 99%
“…Many clustering techniques have been proposed to overcome this problem of categorical grouping data, including to avoid the k-means constraint on data categorization, researchers use hard k-mode as a simple matching function [16]. Next, a new inequality measure is used to improve hard k-modes [17]-[19] and to create fuzzy k-modes [20]. Kim et al [21] demonstrated how to increase the efficiency of the fuzzy k-mode by converting it to a fuzzy centroid.…”
Section: Introduction (citation type: mentioning)
confidence: 99%