Mutual information (MI) is a widely used measure of dependency between two random variables in information theory, statistics, and machine learning. Several recently proposed MI estimators achieve the parametric MSE convergence rate. However, most of them have high computational complexity, at least O(N^2). We propose a unified method for empirical non-parametric estimation of a general MI function between random vectors in R^d based on N i.i.d. samples. The reduced-complexity MI estimator, called the ensemble dependency graph estimator (EDGE), combines randomized locality-sensitive hashing (LSH), dependency graphs, and ensemble bias-reduction methods. We prove that EDGE achieves the optimal computational complexity O(N) and can achieve the optimal parametric MSE rate of O(1/N) if the density is d times differentiable. To the best of our knowledge, EDGE is the first non-parametric MI estimator that can achieve parametric MSE rates with linear time complexity. We illustrate the utility of EDGE for the analysis of the information plane (IP) in deep learning. Using EDGE, we shed light on a controversy over whether the compression property of the information bottleneck (IB) in fact holds for ReLU and other rectification functions in deep neural networks (DNNs).
Recently, Shwartz-Ziv and Tishby used MI to study the training process in deep neural networks (DNNs) [16]. Let X, T, and Y denote the input, hidden, and output layers, respectively. The authors of [16] introduced the information bottleneck (IB), which represents the tradeoff between two mutual information measures: I(X, T) and I(T, Y). They observed that the training process of a DNN consists of two distinct phases: 1) an initial fitting phase in which I(T, Y) increases, and 2) a subsequent compression phase in which I(X, T) decreases.
Saxe et al. [17] countered the claim of [16], asserting that this compression property is not universal but depends on the specific activation function. Specifically, they claimed that the compression property does not hold for ReLU activation functions. The authors of [16] challenged these claims, arguing that the authors of [17] had not observed compression because of poor estimates of the MI. We use our proposed rate-optimal ensemble MI estimator to explore this
arXiv:1801.09125v2 [cs.IT]
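The hashing idea behind a linear-time MI estimate can be illustrated with a minimal sketch. It is not the EDGE estimator itself: it assumes a randomly shifted grid as the hash function, uses the plain plug-in formula for Shannon MI on bucket co-occurrence counts, and omits the ensemble bias-reduction step. The function name `hashed_mi` and the parameters `eps` and `seed` are hypothetical names chosen for this illustration.

```python
from collections import Counter

import numpy as np


def hashed_mi(x, y, eps, seed=0):
    """Plug-in Shannon MI estimate (in nats) from hashed bucket counts.

    Assumption: a randomly shifted grid of cell width `eps` stands in
    for the locality-sensitive hash; no ensemble bias correction is done.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = x.shape[0]
    rng = np.random.default_rng(seed)
    # One random shift per coordinate, drawn from [0, eps).
    shift_x = rng.uniform(0.0, eps, size=x.shape[1])
    shift_y = rng.uniform(0.0, eps, size=y.shape[1])
    # Hash every sample to the integer index of its grid cell.
    hx = [tuple(r) for r in np.floor((x + shift_x) / eps).astype(np.int64)]
    hy = [tuple(r) for r in np.floor((y + shift_y) / eps).astype(np.int64)]
    n_i = Counter(hx)            # samples per X bucket
    m_j = Counter(hy)            # samples per Y bucket
    n_ij = Counter(zip(hx, hy))  # samples per (X bucket, Y bucket) pair
    mi = 0.0
    # Only bucket pairs that actually co-occur contribute, so the loop
    # visits at most n pairs.
    for (i, j), c in n_ij.items():
        mi += (c / n) * np.log(c * n / (n_i[i] * m_j[j]))
    return float(mi)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(size=5000)
    b = 0.9 * a + np.sqrt(1 - 0.9**2) * rng.normal(size=5000)
    print("correlated pair:", hashed_mi(a, b, eps=0.5))
    print("independent pair:", hashed_mi(a, rng.normal(size=5000), eps=0.5))
```

Every sample is hashed once and the counts live in hash tables, so the cost grows linearly with the number of samples. The estimate depends on the cell width `eps`, which is the kind of bias the ensemble step in the abstract is meant to reduce.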
We propose a direct estimation method for Rényi and f-divergence measures based on a new graph-theoretical interpretation. Suppose that we are given two sample sets X and Y, with N and M samples, respectively, where η := M/N is a constant. Considering the k-nearest neighbor (k-NN) graph of Y in the joint data set (X, Y), we show that the average powered ratio of the number of X points to the number of Y points among all k-NN points is proportional to the Rényi divergence between the X and Y densities. A similar method can also be used to estimate f-divergence measures. We derive bias and variance rates, and show that for the class of γ-Hölder smooth functions, the estimator achieves the MSE rate of O(N^(−2γ/(γ+d))). Furthermore, by using a weighted ensemble estimation technique, for density functions with continuous and bounded derivatives of up to order d, and some extra conditions at the support set boundary, we derive an ensemble estimator that achieves the parametric MSE rate of O(1/N). Our estimator requires no boundary correction, and remarkably, boundary issues do not show up. Our approach is also more computationally tractable than competing estimators, which makes it appealing in many practical applications.
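The nearest-neighbor ratio idea above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes the convention D_α(f_X‖f_Y) = (1/(α−1)) log ∫ f_X^α f_Y^(1−α), estimates the density ratio at each Y point as η·N_i/(M_i+1) (the +1 and the exclusion of the query point from its own neighbors are assumptions), uses a brute-force neighbor search rather than a fast k-NN structure, and leaves out the weighted ensemble step. The name `knn_ratio_renyi` and its parameters are hypothetical.

```python
import numpy as np


def knn_ratio_renyi(x, y, k=20, alpha=0.5):
    """Rényi divergence estimate from k-NN counts in the pooled sample.

    Assumptions: brute-force distances, base estimator only (no
    ensemble weighting), density ratio taken as eta * N_i / (M_i + 1).
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, m = len(x), len(y)
    eta = m / n
    z = np.vstack([x, y])  # pooled sample: X points first, then Y points
    # Squared Euclidean distance from every Y point to every pooled point.
    d2 = ((y[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    # A Y point is not counted as its own neighbor.
    d2[np.arange(m), n + np.arange(m)] = np.inf
    # Indices of the k nearest pooled points for each Y point.
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    n_i = (nn < n).sum(axis=1)  # neighbors that come from X
    m_i = k - n_i               # neighbors that come from Y
    # Average powered ratio, then map to the divergence.
    j_hat = np.mean((eta * n_i / (m_i + 1)) ** alpha)
    return float(np.log(j_hat) / (alpha - 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xs = rng.normal(0.0, 1.0, size=1000)
    print("same density:", knn_ratio_renyi(xs, rng.normal(0.0, 1.0, size=1000)))
    print("shifted density:", knn_ratio_renyi(xs, rng.normal(2.0, 1.0, size=1000)))
```

The brute-force distance matrix costs O(M(N+M)) time and memory and is used only to keep the sketch dependency-free. If no Y point has an X neighbor, the average is zero and the logarithm diverges, so `k` should be large enough for the sample sizes at hand.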
Visible light communications (VLC) in indoor environments suffer from the limited bandwidth of LEDs as well as from the inter-symbol interference (ISI) imposed by multipath. In this work, transmission schemes to improve the performance of indoor optical wireless communication (OWC) systems are introduced. Expurgated pulse-position modulation (EPPM) is proposed for this application since it can provide the wide range of peak-to-average power ratios (PAPR) needed for dimming of the indoor illumination. A correlation decoder used at the receiver is shown to be optimal for indoor VLC systems, which are shot-noise and background-light limited. Interleaving applied to EPPM to reduce the ISI effect in dispersive VLC channels can significantly decrease the error probability. The proposed interleaving technique makes EPPM a better modulation option than PPM for VLC systems or any other dispersive OWC system. An overlapped EPPM pulse technique is proposed to increase the transmission rate when bandwidth-limited white LEDs are used as sources.