2004
DOI: 10.1109/tsp.2004.831130

Geodesic Entropic Graphs for Dimension and Entropy Estimation in Manifold Learning

Abstract: In the manifold learning problem, one seeks to discover a smooth low-dimensional surface, i.e., a manifold embedded in a higher-dimensional linear vector space, based on a set of measured sample points on the surface. In this paper, we consider the closely related problem of estimating the manifold's intrinsic dimension and the intrinsic entropy of the sample points. Specifically, we view the sample points as realizations of an unknown multivariate density supported on an unknown smooth manifold. We i…

Citations: cited by 234 publications (169 citation statements)
References: 29 publications
“…It has been attempted to localize PCA to small neighborhoods [39,40,41,42], without much success [43], at least compared to what we may call volume-based methods [44,45,46,47,48,12,13,49,50,51,52,53,54], which we discuss at length in Section 7. These methods, roughly speaking, are based on empirical estimates of the volume of M ∩ B_z(r), for z ∈ M and r > 0: such volume grows like r^k when M has dimension k, and k is estimated by fitting the empirical volume estimates for different values of r. We expect such methods, at least when naively implemented, to both require a number of samples exponential in k (if O(1) samples exist in M ∩ B_z(r_0), for some r_0 > 0, these algorithms require O(2^k) points in M ∩ B_z(2r_0)), and to be highly sensitive to noise, which affects the density in high dimensions.…”
Section: Manifolds, Local PCA and Intrinsic Dimension Estimation (mentioning)
confidence: 99%
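
The volume-scaling idea in the statement above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of any of the cited papers: the function name `local_volume_dimension`, the synthetic data set, and the radii are all hypothetical, and the number of samples inside a ball is used as a stand-in for the volume of M ∩ B_z(r).

```python
import numpy as np

def local_volume_dimension(points, center, radii):
    """Slope of log(count) versus log(radius) around `center`.

    Assumption: the number of samples within distance r of `center`
    is proportional to the volume of the manifold inside that ball,
    so it should grow roughly like r^k for a k-dimensional manifold.
    """
    dists = np.linalg.norm(points - center, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii], dtype=float)
    # Least-squares line through (log r, log count); the slope estimates k.
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# Hypothetical test data: a flat 2-dimensional patch embedded in R^5.
rng = np.random.default_rng(0)
k, ambient_dim, n = 2, 5, 20000
pts = np.zeros((n, ambient_dim))
pts[:, :k] = rng.uniform(-1.0, 1.0, size=(n, k))

radii = np.array([0.1, 0.2, 0.4])
est = local_volume_dimension(pts, np.zeros(ambient_dim), radii)
print(f"estimated dimension: {est:.2f}")
```

Doubling the radius multiplies the expected count by about 2^k, which is why the sample size needed to keep the smallest ball populated grows exponentially with k, as the statement notes.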
“…For each combination of these parameters we generate 5 realizations of the data set and report the most frequent (integral) dimension returned by the set of algorithms specified below, as well as the standard deviation of such estimated dimension. We test the following algorithms, which include volume-based methods, TSP-based methods, and state-of-art Bayesian techniques: "Debiasing" [47], "Smoothing" [46] and RPMM in [61], "MLE" [62], "kNN" [63], "SmoothKNN" [64], "IDE", "TakEst", "CorrDim" [51], "MFA" [65], "MFA2" [66]. It is difficult to make a fair comparison, as several of these algorithms have one or more parameters, and the choice of such parameters is in general not obvious.…”
Section: Manifolds (mentioning)
confidence: 99%
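
The reporting protocol in this statement (several realizations, then the most frequent integer dimension plus a standard deviation) could be summarized as below. This is a sketch under assumptions: the helper name `summarize_runs` and the five example estimates are made up for illustration, and the standard deviation is taken over the raw, unrounded estimates, which the quoted text does not specify.

```python
from collections import Counter

import numpy as np

def summarize_runs(estimates):
    """Return (most frequent rounded dimension, std of the raw estimates)."""
    rounded = [int(round(e)) for e in estimates]
    # most_common(1) returns a list with one (value, count) pair.
    mode = Counter(rounded).most_common(1)[0][0]
    # Assumption: population standard deviation of the unrounded values.
    return mode, float(np.std(estimates))

# Hypothetical estimates from 5 realizations of one data set.
estimates = [2.1, 1.9, 2.2, 3.4, 2.0]
mode, spread = summarize_runs(estimates)
print(mode, round(spread, 3))
```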
“…Rényi entropies are arguably the best known of these, with several applications (e.g., [9], [10]). The Rényi and Shannon entropies are both additive: the joint entropy of independent variables is the sum of the individual entropies.…”
Section: Introduction (mentioning)
confidence: 99%
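
The additivity property mentioned in this statement follows directly from the definition of the Rényi entropy. A short derivation for continuous variables is given below; the notation ($f_X$, $f_Y$ for the densities, $\alpha$ for the order with $\alpha > 0$, $\alpha \neq 1$) is assumed here and is not taken from the citing paper.

```latex
% Renyi entropy of order \alpha for a density f
H_\alpha(X) = \frac{1}{1-\alpha} \log \int f_X(x)^{\alpha}\, dx

% For independent X and Y the joint density factorizes: f_{X,Y}(x,y) = f_X(x) f_Y(y)
H_\alpha(X,Y)
  = \frac{1}{1-\alpha} \log \iint f_X(x)^{\alpha} f_Y(y)^{\alpha}\, dx\, dy
  = \frac{1}{1-\alpha} \log \left[ \int f_X(x)^{\alpha}\, dx \int f_Y(y)^{\alpha}\, dy \right]
  = H_\alpha(X) + H_\alpha(Y)
```

The Shannon case is recovered in the limit $\alpha \to 1$, where the same factorization gives $H(X,Y) = H(X) + H(Y)$.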
“…In contrast to model order selection methods such as MDL, AIC, or BIC (see [1]), we consider non-parametric methods of dimension estimation. When the intrinsic dimension is assumed constant over the data set, several algorithms [2][3][4][5] have been proposed to estimate the dimensionality of the manifold. In several problems of practical interest, however, data will exhibit varying dimensionality.…”
Section: Introduction (mentioning)
confidence: 99%