Mixtures of high-dimensional Gaussian distributions have been studied extensively in statistics and learning theory. While the total variation distance appears naturally in the sample complexity of distribution learning, it is analytically difficult to obtain tight lower bounds for mixtures. Exploiting a connection between the total variation distance and the characteristic function of the mixture, we provide fairly tight functional approximations. This enables us to derive new lower bounds on the total variation distance between pairs of two-component Gaussian mixtures that have a shared covariance matrix.
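For context, the connection mentioned in the abstract rests on the fact that a Gaussian mixture has an explicit characteristic function. As a standard identity (not stated in this excerpt), for a mixture f = ∑ᵢ wᵢ N(µᵢ, Σᵢ):

```latex
\[
  \varphi_f(t) \;=\; \mathbb{E}_{X \sim f}\!\left[e^{i\langle t, X\rangle}\right]
  \;=\; \sum_{i=1}^{k} w_i \exp\!\Big(i\, t^{\top}\mu_i \;-\; \tfrac{1}{2}\, t^{\top}\Sigma_i\, t\Big),
  \qquad t \in \mathbb{R}^d,
\]
```

i.e., the characteristic function is the corresponding mixture of Gaussian characteristic functions, which makes it a tractable proxy for the density.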
Introduction

Let N(µ, Σ) denote the d-dimensional Gaussian distribution with mean µ ∈ R^d and positive definite covariance matrix Σ ∈ R^{d×d}. A k-component Gaussian mixture is a distribution with density

f(x) = ∑_{i=1}^{k} w_i N(x; µ_i, Σ_i),

where w_i ∈ R_+ with ∑_{i=1}^{k} w_i = 1 are the mixing weights, µ_i ∈ R^d are the means, and Σ_i ∈ R^{d×d} are the covariance matrices. Mixtures of Gaussian distributions have been studied intensively due to their broad applicability to statistical problems [2,10,11,21,22,28,29,31,32]. The variational distance (a.k.a. the total variation (TV) distance) between two distributions f, f′ with the same sample space Ω and sigma-algebra S is defined as follows:
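As a concrete numerical illustration (a sketch, not the paper's method), the TV distance between two one-dimensional Gaussian mixtures with a shared scale σ can be estimated as half the L1 distance between their densities; the function names and the grid/truncation choices below are our own assumptions:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def mixture_pdf(x, weights, means, sigma):
    """Density of a Gaussian mixture with shared scale sigma (shared covariance in 1-d)."""
    return sum(w * gaussian_pdf(x, m, sigma) for w, m in zip(weights, means))

def tv_distance(weights1, means1, weights2, means2, sigma, n=200001):
    """Estimate TV(f, f') = (1/2) * integral |f - f'| dx by a Riemann sum.

    The integration window is truncated 10 sigma beyond the extreme means,
    where the remaining mass is negligible.
    """
    lo = min(list(means1) + list(means2)) - 10.0 * sigma
    hi = max(list(means1) + list(means2)) + 10.0 * sigma
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    f = mixture_pdf(x, weights1, means1, sigma)
    g = mixture_pdf(x, weights2, means2, sigma)
    return 0.5 * np.sum(np.abs(f - g)) * dx
```

For instance, two identical mixtures give a TV distance of 0, while two unit-variance Gaussians with means far apart give a value approaching 1; two-component mixtures with slightly perturbed means fall strictly in between, which is exactly the regime where tight lower bounds are hard to obtain analytically.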