2011 IEEE International Symposium on Information Theory Proceedings
DOI: 10.1109/isit.2011.6033726

k-nearest neighbor estimation of entropies with confidence

Abstract: We analyze a k-nearest neighbor (k-NN) class of plug-in estimators for estimating Shannon entropy and Rényi entropy. Based on the statistical properties of k-NN balls, we derive explicit rates for the bias and variance of these plug-in estimators in terms of the sample size, the dimension of the samples, and the underlying probability distribution. In addition, we establish a central limit theorem for the plug-in estimator that allows us to specify confidence intervals on the entropy functionals. As an…
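
The paper itself is not reproduced on this page, so its exact estimator and bias-corrected constants are not available here. As a rough illustration of the k-NN plug-in idea the abstract describes, here is a minimal Python sketch of the classical Kozachenko-Leonenko k-NN estimator of Shannon entropy; the function name, the default k = 4, and the demo distribution are our own choices, not taken from the paper:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_shannon_entropy(x, k=4):
    """Kozachenko-Leonenko k-NN estimate of Shannon entropy (nats).

    x: (n, d) array of i.i.d. samples; k trades bias (larger k)
    against variance (smaller k).
    """
    n, d = x.shape
    tree = cKDTree(x)
    # Distance from each sample to its k-th nearest neighbor; the query
    # returns each point itself at distance 0, hence we ask for k + 1.
    r_k = tree.query(x, k=k + 1)[0][:, -1]
    # Log-volume of the unit ball in R^d: pi^(d/2) / Gamma(d/2 + 1).
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    # H_hat = psi(n) - psi(k) + log V_d + (d / n) * sum_i log r_k(i)
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(r_k))

# Demo: 2-D standard Gaussian; true entropy is log(2*pi*e), about 2.838 nats.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))
print(f"H_hat = {knn_shannon_entropy(x):.3f} nats")
```

The central limit theorem highlighted in the abstract is what turns such a point estimate into an interval: once the estimator is asymptotically normal with a computable variance rate, confidence intervals of the form H_hat ± z·σ̂ follow directly.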

Cited by 11 publications (8 citation statements). References: 21 publications.
“There are many different approaches for estimating information measures, including kernel-based methods, nearest-neighbor methods, methods based on sample distances, as well as multiple variants of plug-in estimates. Many estimators have been shown to be consistent and/or asymptotically unbiased under various constraints, e.g., in [17, 1, 12, 21, 36]. An excellent overview can be found in [6].…”
Section: Previous Work
confidence: 99%
“Seo et al. in [41] proposed a data anomaly detection method based on micro-clustering, and designed a method for detecting and specifying outliers using a local outlier as the center of a micro-cluster in offline components. Sricharan et al. in [42] defined outliers for data in WSNs and proposed a classification-based method that estimates them via the probability density function. This method has been shown to apply to different types of data, including Gaussian-distributed data.…”
Section: Related Work
confidence: 99%
“In that case, the empirical relative frequencies in these 100 bins are no longer reasonable estimates of the true joint probabilities. How to estimate the entropy under these circumstances has been studied intensely in recent years (Paninski, 2003; Nemenman et al., 2004; Paninski, 2004; Shwartz et al., 2005; Ho et al., 2010; Sricharan et al., 2011).…”
Section: An Information-theoretic Definition Of Temporal Contingency
confidence: 99%
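
The undersampling problem this last quote refers to is easy to reproduce numerically. The sketch below is our own illustration in Python, not taken from any of the cited papers: with 200 samples spread over 100 bins of a uniform distribution, the naive plug-in entropy falls visibly short of the true value log 100, and the classical Miller-Madow correction recovers only part of the gap.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins, n_samples = 100, 200           # more bins than the samples can fill
p_true = np.full(n_bins, 1.0 / n_bins) # true distribution: uniform
h_true = np.log(n_bins)                # true entropy in nats

counts = rng.multinomial(n_samples, p_true)
p_hat = counts[counts > 0] / n_samples
h_plugin = -np.sum(p_hat * np.log(p_hat))             # naive plug-in estimate
m_occupied = np.count_nonzero(counts)                 # number of occupied bins
h_mm = h_plugin + (m_occupied - 1) / (2 * n_samples)  # Miller-Madow correction

print(f"true {h_true:.3f}  plug-in {h_plugin:.3f}  Miller-Madow {h_mm:.3f}")
```

The works cited in the quote study corrections and estimators for exactly this regime; the k-NN approach of the paper above sidesteps binning altogether by working with continuous sample distances.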