2017
DOI: 10.48550/arxiv.1704.00454
Preprint

Clustering in Hilbert simplex geometry

Frank Nielsen,
Ke Sun

Abstract: Clustering categorical distributions in the probability simplex is a fundamental task met in many applications dealing with normalized histograms. Traditionally, the differential-geometric structures of the probability simplex have been used either by (i) setting the Riemannian metric tensor to the Fisher information matrix of the categorical distributions, or (ii) defining the dualistic information-geometric structure induced by a smooth dissimilarity measure, the Kullback-Leibler divergence. In this work, we…

Cited by 4 publications (4 citation statements)
References 63 publications
“…This has the practical-useful consequence that a lot of classical multivariate statistical theory can easily be applied in CoDA. But other geometries in the simplex have been studied, see for instance Nielsen and Sun (2017), which to our knowledge have not been applied in CoDA; TSVD falls in this category.…”
Section: Aitchison's Principle Of Subcompositional Coherence (mentioning)
confidence: 99%
“…For example, see the adaptation of K-means clustering to several similarity measures considered by Stanitsas et al (2017), which include the affine-invariant Riemannian (AIRM) and log-Euclidean metrics. See also Nielsen and Sun (2017) for an account of clustering in Hilbert simplex geometry. In this work, we consider K-means clustering with the Thompson distance d ∞ as the similarity measure, and employ the IMR to calculate centroids for clusters.…”
Section: Midrange Clustering On Matrix Data (mentioning)
confidence: 99%
“…In the experiment we considered = 1e−5. The geodesic induced by the Fisher metric is, $d(y, y') = \arccos\left(\sum_{i=1}^{m} \sqrt{y_i y'_i}\right)$ [26]. This geodesic comes from applying the map $\pi : \Delta^m \to S^{m-1}$, $\pi(y) = (\sqrt{y_1},$ .…”
Section: Multilabel Classification On the Statistical Manifold (mentioning)
confidence: 99%
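
The distance quoted in the last citation statement can be illustrated with a minimal Python sketch. It assumes the summand is the square root of y_i y'_i, which is what the square-root map π onto the sphere implies, and it follows the quoted form without any leading constant. The function name `fisher_geodesic_distance` and the clamping step are illustrative assumptions, not taken from the citing paper.

```python
import math


def fisher_geodesic_distance(y, y_prime):
    """Arc-cosine distance between two points of the probability simplex.

    Assumed form: arccos(sum_i sqrt(y_i * y'_i)), i.e. the angle between
    the images of y and y' under the map pi(y) = (sqrt(y_1), ..., sqrt(y_m)).
    """
    if len(y) != len(y_prime):
        raise ValueError("both distributions must have the same length")
    # Inner product of the square-root embeddings (Bhattacharyya coefficient).
    inner = sum(math.sqrt(a * b) for a, b in zip(y, y_prime))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    inner = min(1.0, max(-1.0, inner))
    return math.acos(inner)


if __name__ == "__main__":
    p = [0.2, 0.3, 0.5]
    q = [0.1, 0.6, 0.3]
    print(fisher_geodesic_distance(p, q))
```

Under this form, two distributions with disjoint support are at distance π/2, since their square-root embeddings are orthogonal on the sphere.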