2009
DOI: 10.1007/s00357-009-9037-9

The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering

Abstract: An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying the extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding …
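To make the abstract's central claim concrete, the sketch below illustrates one simple way to quantify the degree of ultrametricity of a point cloud: sample triples of points and count how often the triangle they form is approximately isosceles with a small base, i.e. its two largest pairwise distances nearly coincide. This is a hedged illustration only; the function name `ultrametricity_fraction`, the tolerance `rel_tol`, and the sampling scheme are assumptions made for this sketch, not necessarily the coefficient used in the paper.

```python
# Minimal sketch (not the paper's exact coefficient) of measuring the degree
# of ultrametricity: sample triples of points and count the fraction whose
# two largest pairwise distances are nearly equal, i.e. the triple roughly
# satisfies the strong (ultrametric) triangle inequality.
import numpy as np

def ultrametricity_fraction(X, n_triples=10_000, rel_tol=0.05, seed=0):
    """Fraction of sampled triples that are approximately ultrametric."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    hits = 0
    for _ in range(n_triples):
        i, j, k = rng.choice(n, size=3, replace=False)
        d = sorted([np.linalg.norm(X[i] - X[j]),
                    np.linalg.norm(X[j] - X[k]),
                    np.linalg.norm(X[i] - X[k])])
        # Ultrametric triangles are isosceles with a small base:
        # the two largest distances should (almost) coincide.
        if d[2] - d[1] <= rel_tol * d[2]:
            hits += 1
    return hits / n_triples

# Hypothetical illustration of the paper's claim: for uniform random data,
# the fraction of (near-)ultrametric triples tends to grow with dimension.
for dim in (2, 20, 200, 2000):
    X = np.random.default_rng(1).uniform(size=(200, dim))
    print(dim, round(ultrametricity_fraction(X), 3))
```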


Cited by 34 publications (36 citation statements). References 27 publications.
“…Several authors, such as [44,75], have shown that, in the context of clustering, high-dimensional spaces do have useful characteristics which ease the classification of data in those spaces. In particular, Scott and Thompson [88] showed that high-dimensional spaces are mostly empty.…”
Section: The Blessing of Dimensionality in Clustering (mentioning)
confidence: 99%
“…as an asymptotic lower bound on the first term of Eq. (24). Clearly, as ξ → 0, the first term in Eq.…”
Section: Proof (mentioning)
confidence: 93%
“…Affine hulls, on the other hand, give a loose approximation of class regions, as they model each class as an affine subspace. Therefore, NAH classification may prove a promising instance-based classifier in HDLSS settings, where the notion of a neighborhood breaks down [9,11,17].…”
Section: Efficient Nearest Affine Hull Classification (mentioning)
confidence: 99%
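As a rough sketch of the nearest affine hull (NAH) idea referenced in the statement above (an assumed illustration, not the cited authors' implementation): each class is modelled by the affine hull of its training samples, and a query is assigned to the class whose hull approximates it most closely. The helper names `affine_hull_distance` and `nah_classify` are hypothetical.

```python
# Hedged sketch of nearest affine hull (NAH) classification: distance from a
# query to each class's affine hull is computed by least squares, and the
# query is assigned to the class with the smallest residual.
import numpy as np

def affine_hull_distance(q, X_class):
    """Distance from q to the affine hull of the rows of X_class."""
    base = X_class[0]
    # Directions spanning the affine hull, as columns of A.
    A = (X_class[1:] - base).T
    coeffs, *_ = np.linalg.lstsq(A, q - base, rcond=None)
    residual = (q - base) - A @ coeffs
    return np.linalg.norm(residual)

def nah_classify(q, class_samples):
    """class_samples: dict mapping label -> (m_i, d) array of samples."""
    return min(class_samples,
               key=lambda c: affine_hull_distance(q, class_samples[c]))

# Toy HDLSS-style usage: few samples per class, many dimensions.
rng = np.random.default_rng(0)
d = 500
classes = {0: rng.normal(0.0, 1.0, size=(5, d)),
           1: rng.normal(0.5, 1.0, size=(5, d))}
query = rng.normal(0.5, 1.0, size=d)
print(nah_classify(query, classes))
```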
“…Murtagh [17] showed that ultrametricity becomes pervasive as dimensionality and spatial sparsity increase, and used this property in model-based clustering. Klement et al. [15] proved that, for d → ∞, random and non-random scenarios are not distinguishable by any metric, since distances become approximately equal.…”
Section: Related Work (mentioning)
confidence: 99%
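The distance-concentration effect attributed to Klement et al. in the statement above can be illustrated in a few lines. The following is an assumed numerical sketch, not the cited proof: for uniform random data, the relative contrast between a query's farthest and nearest neighbour shrinks as dimensionality grows, so all distances become approximately equal.

```python
# Numerical sketch of distance concentration in high dimensions: the relative
# contrast (d_max - d_min) / d_min between a query and a random point cloud
# shrinks as the dimension grows.
import numpy as np

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 1000, 10000):
    X = rng.uniform(size=(1000, dim))
    q = rng.uniform(size=dim)
    dists = np.linalg.norm(X - q, axis=1)
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={dim:>5}: relative contrast = {contrast:.3f}")
```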