2017
DOI: 10.1002/sta4.167

Bump hunting by topological data analysis

Abstract: A topological data analysis approach is taken to the challenging problem of finding and validating the statistical significance of local modes in a data set. As with the SIgnificance of the ZERo (SiZer) approach to this problem, statistical inference is performed in a multi-scale way, that is, across bandwidths. The key contribution is a two-parameter approach to the persistent homology representation. For each kernel bandwidth, a sub-level set filtration of the resulting kernel density estimate is computed. In…
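The two-parameter idea in the abstract (one persistence computation per kernel bandwidth) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one-dimensional data, a Gaussian kernel evaluated on a fixed grid, and it tracks modes through super-level sets of the density estimate, which is equivalent to a sub-level set filtration of the negated estimate. The function names and the persistence threshold are hypothetical, and the threshold stands in for the paper's statistical inference step, which is not reproduced here.

```python
import math


def gaussian_kde_on_grid(data, bandwidth, grid):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
        for x in grid
    ]


def mode_persistence(values):
    """Degree-zero persistence of the super-level sets of a 1-D function.

    Returns (birth, death) pairs with birth > death. Each pair is one local
    mode: it is born at its peak height and dies where it merges into a
    higher mode. The highest mode is paired with the global minimum.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: -values[i])  # highest first
    parent = [None] * n  # None marks grid points not yet in the filtration

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the lower peak dies.
                # The root of a component is always its peak index.
                if values[ri] <= values[rj]:
                    young, old = ri, rj
                else:
                    young, old = rj, ri
                if values[young] > values[i]:  # skip zero-persistence pairs
                    pairs.append((values[young], values[i]))
                parent[young] = old
    pairs.append((values[order[0]], values[order[-1]]))
    return pairs


def count_persistent_modes(data, bandwidth, grid, min_persistence):
    """Count modes whose persistence (birth - death) exceeds a threshold."""
    density = gaussian_kde_on_grid(data, bandwidth, grid)
    return sum(1 for b, d in mode_persistence(density) if b - d > min_persistence)


if __name__ == "__main__":
    # Toy sample with two well-separated clusters (illustrative data only).
    sample = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
    grid = [-4.0 + 0.05 * k for k in range(161)]
    for h in (0.3, 1.0, 3.0):
        print(h, count_persistent_modes(sample, h, grid, min_persistence=0.01))
```

Running the loop over several bandwidths gives one persistence diagram per smoothing level. Inspecting how the pairs move as the bandwidth changes is the multi-scale view the abstract describes, here reduced to a simple count.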

Cited by 6 publications
(8 citation statements)
References 37 publications
“…However, the deviation of each data point from a cluster centroid is measured instead of variance. -Various methods for density modes (equivalently zero-dimensional density ridges or degree zero homological features) have been recently proposed (Chacón, 2020; Chaudhuri & Marron, 1999; Chazal et al., 2018; Chen et al., 2016; Comaniciu & Meer, 2002; Fasy et al., 2014; Genovese et al., 2014; Genovese et al., 2016; Sommerfeld et al., 2017; Zhang & Ghanem, 2021).…”
Section: Deviation) [Dimensionless]
Citation type: mentioning
confidence: 99%
“…For this single‐bandwidth case, Fasy et al (2014) showed that it is possible to construct a band above the diagonal to identify the significant modes, in the sense that all the modes with death–birth pairs lying on this low persistence band can be considered to be caused by random fluctuation. On the other hand, Sommerfeld et al (2017) suggested to inspect all the persistence diagrams as the bandwidth varies to evaluate the significance of the modes across different degrees of smoothing. They showed through simulated and real data examples that such an approach appears to be more powerful than SiZer.…”
Section: Many Modes
Citation type: mentioning
confidence: 99%
“…In contrast, the remaining three modes appear to be really there, and besides, they have death-birth levels which are very similar to those of the true density modes. For this single-bandwidth case, Fasy et al (2014) showed that it is possible to construct a band above the diagonal to identify the significant modes; on the other hand, Sommerfeld et al (2017) suggested to inspect all the persistence diagrams as the bandwidth varies to evaluate the significance of the modes across different degrees of smoothing.…”
Section: Many Modes
Citation type: mentioning
confidence: 99%
“…In turn, modes have paved the way to introduce related concepts in data structures such as bumps, components, clusters, or classes, among others. However, chasing modes has proven to be extremely difficult in large and even not‐so‐large dimensions, so huge amounts of statistical research are devoted to improve already existing methods or developing new ones to allow a more efficient way to tackle the problem. In this paper, we are primarily concerned with the particular case of mode hunting.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…However, chasing modes has proven to be extremely difficult in large and even not-so-large dimensions, so huge amounts of statistical research are devoted to improve already existing methods or developing new ones to allow a more efficient way to tackle the problem. [1][2][3][4] In this paper, we are primarily concerned with the particular case of mode hunting. Thus, we propose here Active Information Mode Hunting (AIMH), a multivariate algorithm to chase modes based on information theory.…”
Section: Introduction
Citation type: mentioning
confidence: 99%