We study error exponents for source coding with side information. Both achievable exponents and converse bounds are obtained for the following two cases: lossless source coding with coded side information (SCCSI) and lossy source coding with full side information (Wyner-Ziv). These results recover and extend several existing results on source-coding error exponents and are tight in some circumstances. Our bounds have a natural interpretation as a two-player game between nature and the code designer, with nature seeking to minimize the exponent and the code designer seeking to maximize it. In the Wyner-Ziv problem, our analysis exposes a tension in the choice of test channel, with the optimal test channel balancing two competing error events. The Gaussian and binary-erasure cases are examined in detail.

I. INTRODUCTION

In a typical lossy data compression problem, a source is to be compressed by an encoder at a prescribed rate so that a decoder may reproduce the source to within some desired fidelity (distortion). In addition to the data to be compressed, there is sometimes correlated information that can be utilized by a second encoder, which is able to send a separate message to the decoder. We refer to this kind of problem as source coding with side information (SCSI). The setup is depicted in Fig. 1, where a source X is compressed by the first encoder at rate R_1, and the decoder has access both to the compressed version of X and to encoded side information Y, compressed at rate R_2 by the second encoder.

The SCSI scenario arises in a variety of applications. For example, in video applications [1], X can represent a current frame and Y a separate correlated frame sent from a second encoder. Y can even represent the frame(s) preceding the current frame X in the stream: while the previous frames are certainly available to the encoder, the encoder's coding scheme can be simplified by not making use of this information, leaving the decoder to exploit the interframe dependence. A second example can be found in communication in networks with relays [2]. A source sends a message X to a sink in a network containing a relay. One mode of operation for the relay is "compress and forward", i.e., the relay sends a compressed version of its observation, Y, of the source-sink message to the sink. This compressed message can be used by the sink to further aid its decoding. SCSI also appears in applications beyond communication; for example, with minor changes it has been proposed as a model for rate-constrained pattern recognition [3].

For the lossless problem with coded side information (SCCSI) and the lossy problem with full side information (Wyner-Ziv), the "rate region" problem, i.e., determining the rates required to meet a given average distortion constraint, is solved. In this paper, we study these two problems from an error-exponent standpoint. Our motivation for doing so is threefold:
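As background on the quantities studied above, and with notation that is illustrative rather than necessarily the paper's own, the error exponent of a code sequence operating at rates (R_1, R_2) and the Wyner-Ziv rate-distortion function with full side information Y at the decoder are commonly defined as

E(R_1, R_2) = \liminf_{n \to \infty} -\frac{1}{n} \log P_e^{(n)},

where P_e^{(n)} is the probability of decoding error at blocklength n, and

R_{\mathrm{WZ}}(D) = \min_{\substack{U:\ U - X - Y \\ \exists g:\ \mathbb{E}\, d(X, g(U,Y)) \le D}} \big[ I(X;U) - I(U;Y) \big] = \min I(X;U \mid Y),

with the minimum taken over test channels U forming the Markov chain U - X - Y and reproduction functions g meeting the distortion constraint. The tension in the choice of test channel mentioned in the abstract concerns this auxiliary variable U.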
Given training sequences generated by two distinct but unknown distributions sharing a common alphabet, we study the problem of determining whether a third test sequence was generated according to the first or the second distribution using only the training data. To better model sources such as natural language, for which the underlying distributions are difficult to learn, we allow the alphabet size to grow, and therefore the probability distributions to change, with the blocklength. Our primary focus is the situation in which the underlying probabilities are all of the same order, and in this regime we give conditions on the alphabet growth rate and distributions guaranteeing the existence of universally consistent tests, i.e. tests having a probability of error tending to zero with the blocklength for any underlying distributions. We show that some commonly used statistical tests are universally consistent provided the alphabet growth is sub-linear, but these tests are inconsistent for linear growth rates. We then propose a classifier that is universally consistent for alphabet growth rates up to quadratic and show that no classifier can handle the case in which the alphabet grows quadratically or faster. If the tester is given the underlying distributions in place of the training data, we prove that consistent testing is possible regardless of the growth of the underlying alphabet. Our results are then used to illuminate the problem of classifying arbitrary (i.e. non-homogeneous) distributions on growing alphabets.
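To make concrete the kind of test referred to above, the following is a minimal Python sketch of a hypothetical nearest-empirical-distribution rule: it assigns the test sequence to whichever training sequence has the closer empirical distribution in L1 distance. This is one simple instance of a test built from empirical distributions; it is not necessarily one of the tests analyzed in the paper, and the function names are illustrative.

from collections import Counter

def empirical(seq):
    """Empirical distribution of a sequence as a dict: symbol -> relative frequency."""
    n = len(seq)
    counts = Counter(seq)
    return {s: c / n for s, c in counts.items()}

def l1_distance(p, q):
    """L1 distance (twice the total variation) between two empirical distributions."""
    support = set(p) | set(q)
    return sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

def classify(test, train1, train2):
    """Assign the test sequence to the class whose training sequence has the
    closer empirical distribution in L1 distance (ties broken toward class 1)."""
    p_test = empirical(test)
    d1 = l1_distance(p_test, empirical(train1))
    d2 = l1_distance(p_test, empirical(train2))
    return 1 if d1 <= d2 else 2

if __name__ == "__main__":
    # Toy example over a small alphabet.
    train1 = "aabababaabab"
    train2 = "ccdcddccdccd"
    print(classify("ababab", train1, train2))  # expected: 1
    print(classify("cdcdcd", train1, train2))  # expected: 2

Whether a rule of this type remains consistent as the alphabet grows with the blocklength is exactly the question the abstract addresses.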
We provide a novel upper bound on Witsenhausen's rate, the rate required in the zero-error analogue of the Slepian-Wolf problem; our bound is given in terms of a new information-theoretic functional defined on a certain graph. We then use this functional to give a single-letter lower bound on the error exponent for the Slepian-Wolf problem under the vanishing-error-probability criterion, where the decoder has full (i.e. unencoded) side information. Our exponent stems from a new encoding scheme, which makes use of the source distribution only through the positions of the zeros in the 'channel' matrix connecting the source with the side information, and in this sense is 'semi-universal'. We demonstrate that our error exponent can beat the 'expurgated' source-coding exponent of Csiszár and Körner, whose achievability requires the use of a non-universal maximum-likelihood decoder. An extension of our scheme to the lossy (i.e. Wyner-Ziv) case is given. For the case when the side information is a deterministic function of the source, the exponent of our improved scheme agrees with the sphere-packing bound exactly, thus determining the reliability function. An application of our functional to zero-error channel capacity is also given.
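For reference, Witsenhausen's rate mentioned above admits the following standard characteristic-graph characterization (stated here only as background; the new graph functional introduced in the paper is a different object):

R_W = \lim_{n \to \infty} \frac{1}{n} \log \chi\!\left( G_X^{\wedge n} \right),

where G_X is the characteristic graph of the source X given the side information Y (two source symbols x \ne x' are adjacent iff there exists y with P(x,y)\,P(x',y) > 0), G_X^{\wedge n} is its n-fold AND power, and \chi(\cdot) denotes the chromatic number.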