2008
DOI: 10.1007/978-3-540-78246-9_16

The Noise Component in Model-based Cluster Analysis

Abstract: The so-called noise-component has been introduced by Banfield and Raftery (1993) to improve the robustness of cluster analysis based on the normal mixture model. The idea is to add a uniform distribution over the convex hull of the data as an additional mixture component. While this yields good results in many practical applications, there are some problems with the original proposal: 1) As shown by Hennig (2004), the method is not breakdown-robust. 2) The original approach doesn't define a proper ML …
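As a reading aid, the mixture described in the abstract can be written out as below. This is a sketch in assumed notation, not taken from the paper: $\pi_0$ is the noise proportion, $C$ the convex hull of the data with volume $V$, $k$ the number of normal components, and $\varphi$ the normal density.

```latex
f(x) \;=\; \pi_0 \,\frac{\mathbf{1}\{x \in C\}}{V}
      \;+\; \sum_{j=1}^{k} \pi_j \,\varphi(x;\, \mu_j, \Sigma_j),
\qquad \sum_{j=0}^{k} \pi_j = 1 .
```

Because $C$ and $V$ are computed from the data themselves, the uniform term changes with the sample, which is the feature the listed problems refer to.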

Cited by 19 publications (14 citation statements)
References 9 publications
“…Good results of the use of this specific mixture model were observed in different situations. Hennig and Coretto [22] proposed recently to use an improper uniform distribution that does not depend on the data for improving the robustness and provide a better approximation of the likelihood than the one proposed in the original work. An application of noise detection is proposed in Section 4.…”
Section: Mixture Of Parsimonious Gaussians (mentioning)
confidence: 99%
“…While each cluster is generally represented by one mixture component, propositions have been recently made to model each cluster by several mixture components to give the model more flexibility (Baudry et al, 2010). Non-clustering observations, if considered, are generally viewed as noise rather than as a component of interest (see, e.g., Dasgupta and Raftery, 1998; Hennig and Coretto, 2008). A case can be made for fitting mixture models in a Bayesian framework (Frühwirth-Schnatter, 2006; Fritsch and Ickstadt, 2009).…”
Section: Introduction (mentioning)
confidence: 98%
“…, K the regular components. Banfield & Raftery (1993) and Hennig & Coretto (2007) use a uniform distribution for f_0; the main difference is that the former estimate the range of the uniform from the data, while the latter use either an improper uniform with pre-specified fixed value for the height of the density, or an ML estimate for the complete mixture including the noise component. Both consider only the case of model-based clustering, i.e., no regression.…”
Section: Modelling Background Noise (mentioning)
confidence: 99%
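To make the "improper uniform with pre-specified fixed value for the height of the density" concrete, here is a minimal one-dimensional sketch. It assumes the regular components are already fitted and only computes the posterior probability that a point belongs to the noise component. The function names, the constant `c`, and all parameter values are hypothetical illustrations, not taken from any of the cited papers.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def noise_posterior(x, noise_weight, c, weights, means, sds):
    """Posterior probability that x belongs to the noise component.

    The noise component is an improper uniform: a constant density
    height c (assumed fixed in advance) that does not depend on the data.
    """
    noise = noise_weight * c
    regular = sum(
        w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)
    )
    return noise / (noise + regular)


# Illustrative (made-up) parameters: two normal clusters plus noise.
weights, means, sds = [0.45, 0.45], [0.0, 5.0], [1.0, 1.0]
noise_weight, c = 0.10, 0.01

p_center = noise_posterior(0.0, noise_weight, c, weights, means, sds)
p_far = noise_posterior(30.0, noise_weight, c, weights, means, sds)
print(p_center, p_far)
```

A point at a cluster centre gets a small noise probability, while a point far from both clusters is assigned to noise almost surely, however far away it lies, because the constant `c` never shrinks with the spread of the data.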
“…Thus, we knock out outliers everywhere except for the main support regions of the regular components. For uniforms, we need to solve the ill-conditioned estimation problem of the boundaries of the uniform distribution, see Hennig & Coretto (2007) for a detailed discussion. For the Gaussians exact estimation of variance is not really critical (a rather unusual situation!…”
Section: Gaussian Response (mentioning)
confidence: 99%