We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long-range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our interest is in describing their convergence with sequence length, assuming no limits on the space and time complexities of the compression algorithms. A scaling law is proposed for extrapolation from finite sample lengths. This is applied to sequences of dynamical systems in non-trivial chaotic regimes, a 1-D cellular automaton, and to written English texts.

Partially random chains of symbols $s_1, s_2, s_3, \ldots$ drawn from some finite alphabet (we restrict ourselves here to finite alphabets, though most of our considerations would also apply to countable ones) appear in practically all sciences. Examples include spins in one-dimensional magnets, written texts, DNA sequences, geological records of the orientation of the magnetic field of the earth, and bits in the storage and transmission of digital data. An interesting question in all these contexts is to what degree these sequences can be "compressed" without losing any information. This question was first posed by Shannon [1] in a probabilistic context. He showed that the relevant quantity is the entropy (or average information content) h, which in the case of magnets coincides with the thermodynamic entropy of the spin degrees of freedom. Estimating the entropy is non-trivial in the presence of complex and long-range correlations. In that case one essentially has to understand these correlations perfectly in order to achieve optimal compression and entropy estimation, and thus estimates of h also measure the degree to which the structure of the sequence is understood.
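As a rough illustration of the idea of estimating h from compression code lengths (a minimal sketch only, not one of the algorithms analyzed in this paper), the following Python snippet uses the general-purpose zlib compressor: the compressed length in bits, divided by the number of symbols, gives an upper-bound estimate of h that approaches the true entropy only to the extent that the compressor captures the correlations in the sequence.

```python
import math
import zlib


def entropy_estimate_bits_per_symbol(symbols):
    """Crude upper-bound estimate of h (bits/symbol) from a compressor's code length.

    Illustration only: zlib is a generic LZ77/Huffman compressor, not one of the
    algorithms discussed here, and its header overhead biases short sequences.
    """
    data = "".join(symbols).encode("ascii")              # map symbols to bytes
    code_length_bits = 8 * len(zlib.compress(data, 9))   # total compressed size in bits
    return code_length_bits / len(symbols)


if __name__ == "__main__":
    import random
    random.seed(0)
    # Binary sequence from a biased coin; its true entropy is
    # h = -p log2 p - (1 - p) log2 (1 - p).
    p = 0.1
    n = 100_000
    seq = ["1" if random.random() < p else "0" for _ in range(n)]
    h_true = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    print(f"compression estimate: {entropy_estimate_bits_per_symbol(seq):.3f} bits/symbol")
    print(f"true entropy:         {h_true:.3f} bits/symbol")
```

For such a finite sample the estimate typically lies somewhat above the true value; how such compression-based estimates converge to h with increasing sequence length is precisely the question addressed in the text.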