2005
DOI: 10.1109/tpami.2005.95
Automated variable weighting in k-means type clustering

Abstract: This paper proposes a k-means type clustering algorithm that can automatically calculate variable weights. A new step is introduced into the k-means clustering process to iteratively update variable weights based on the current partition of the data, and a formula for weight calculation is proposed. A convergence theorem for the new clustering process is given. The variable weights produced by the algorithm measure the importance of variables in clustering and can be used for variable selection in data mining applications.
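The iterative scheme described in the abstract (assign points, update centers, then update variable weights from the current partition) can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it assumes squared Euclidean per-variable distances, a weight exponent beta > 1, and a dispersion-based weight update w_j proportional to 1 / sum_t (D_j / D_t)^(1/(beta-1)), where D_j is the within-cluster dispersion of variable j; all function and parameter names are invented for illustration.

```python
import numpy as np

def weighted_kmeans(X, k, beta=2.0, n_iter=20, init=None, seed=0):
    """Sketch of k-means with automatic variable weighting.

    Assumptions (not taken verbatim from the paper): squared Euclidean
    per-variable distances, beta > 1, and weights updated from the
    per-variable within-cluster dispersions D_j.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    centers = (np.array(init, dtype=float) if init is not None
               else X[rng.choice(n, size=k, replace=False)].astype(float))
    w = np.full(m, 1.0 / m)  # variable weights, kept summing to 1

    for _ in range(n_iter):
        # Step 1: assign each point by the weighted distance sum_j w_j^beta * d_j
        d2 = (X[:, None, :] - centers[None, :, :]) ** 2       # shape (n, k, m)
        labels = np.argmin((d2 * w**beta).sum(axis=2), axis=1)

        # Step 2: move each non-empty cluster's center to its mean
        for l in range(k):
            if np.any(labels == l):
                centers[l] = X[labels == l].mean(axis=0)

        # Step 3: update weights from per-variable dispersions:
        # D_j = total squared gap to the assigned center along variable j
        d2 = (X[:, None, :] - centers[None, :, :]) ** 2
        D = d2[np.arange(n), labels, :].sum(axis=0)
        D = np.maximum(D, 1e-12)                              # guard zero dispersion
        w = 1.0 / ((D[:, None] / D[None, :]) ** (1.0 / (beta - 1))).sum(axis=1)

    return labels, centers, w
```

Variables with small within-cluster dispersion end up with large weights, which is what makes the learned weights usable for variable selection.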

Cited by 703 publications (387 citation statements)
References 17 publications
“…There are mainly two types. The first is text clustering [9]–[14]. Cluster analysis [9] is one of the important methods for realizing text mining.…”
Section: Introduction (mentioning)
confidence: 99%
“…This gives rise to a distance function over weighted features. Another representative method is the automatic feature-weighting technique [12,13]. In k-means or FCM, the feature weight vector indicates the importance of each feature over the whole data set.…”
Section: Introduction (mentioning)
confidence: 99%
“…Feature weighting can be done at the same time as the clustering itself. Feature weighting has received considerable attention in partitional clustering (Amorim and Mirkin, 2012; Amorim and Fenner, 2012; Chan et al., 2004; Huang et al., 2005, 2008; Makarenkov and Legendre, 2001), but not so in hierarchical clustering. Surely, it is possible to apply a feature selection algorithm to a dataset before using the Ward method.…”
Section: Introduction (mentioning)
confidence: 99%
“…We have decided to use the L_p norm because this transforms the weights into feature rescaling factors, in contrast to the work of Chan et al. (2004) and Huang et al. (2005, 2008).…”
Section: Introduction (mentioning)
confidence: 99%
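The rescaling-factor point in the excerpt above can be checked numerically: with an L_p dissimilarity, weighting feature j by w_j is algebraically the same as multiplying that feature's values by w_j before computing an unweighted distance. The values of p, w, x, and z below are made up purely for illustration.

```python
import numpy as np

# With an L_p dissimilarity, w_j**p * |x_j - z_j|**p == |w_j*x_j - w_j*z_j|**p,
# so feature weighting is equivalent to clustering a rescaled data set.
p = 3.0
w = np.array([0.2, 0.5, 0.3])   # hypothetical feature weights
x = np.array([1.0, 4.0, -2.0])  # hypothetical data point
z = np.array([0.5, 1.0, 3.0])   # hypothetical cluster center

weighted = np.sum(w**p * np.abs(x - z)**p)   # weighted L_p distance (to the p-th power)
rescaled = np.sum(np.abs(w*x - w*z)**p)      # unweighted distance on rescaled features
assert np.isclose(weighted, rescaled)
```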
“…This essentially inductive view is that the disbenefits of noise and uncertainty generated by data-led generalisation are outweighed by the greater risk of straitjacketing a classification to realise pre-ordained outcomes. Only recently have studies been conducted into how weighting schemes can be automated through an adaptation of the k-means algorithm (Huang et al., 2005). Although some view PCA as useful for filtering variables that may be redundant or have negative effects on classification outcomes (Debenham et al., 2002), a contrary view is that the technique results in undesirable information loss and produces results that are complex and difficult to interpret (Harris et al., 2005).…”
Section: Building the Bespoke HE Geodemographic Classification (mentioning)
confidence: 99%