2004
DOI: 10.1109/tit.2004.830761
Universal Compression of Memoryless Sources Over Unknown Alphabets

Abstract: For a collection of distributions over a countable support set, the worst case universal compression formulation by Shtarkov attempts to assign a universal distribution over the support set. The formulation aims to ensure that the universal distribution does not underestimate the probability of any element in the support set relative to distributions in the collection. When the alphabet is uncountable and we have a collection P of Lebesgue continuous measures instead, we ask if there is a corresponding univers…
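
For context (this display is the standard textbook form of Shtarkov's formulation, not text from the abstract): over a countable support set $\mathcal{X}$ and a collection $\mathcal{P}$, the worst-case optimal universal distribution is the normalized maximum likelihood (NML) distribution

$$q^*(x) \;=\; \frac{\sup_{p \in \mathcal{P}} p(x)}{\sum_{x' \in \mathcal{X}} \sup_{p \in \mathcal{P}} p(x')},$$

which minimizes the worst-case regret $\max_{x \in \mathcal{X}} \log\bigl(\sup_{p \in \mathcal{P}} p(x) / q(x)\bigr)$ over all choices of $q$; the minimax value is the logarithm of the normalizer (the Shtarkov sum). This is the sense in which $q^*$ "does not underestimate" any element's probability relative to the collection. The abstract's question is whether an analogue exists when $\mathcal{P}$ consists of Lebesgue continuous measures on an uncountable alphabet.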

Cited by 115 publications (167 citation statements). References 36 publications.

“…This establishes (1). Since $nP_n(X_n)$ converges in distribution to $Y$, we can create a sequence of random variables $\{Y_n\}_{n=1}^{\infty}$…”
Section: Total Probability Convergence
confidence: 84%
“…He noted that his result implies that the class of stationary and ergodic sources over an infinite alphabet is not universally compressible, in contrast to the finite-alphabet case. The impetus for the present paper came from the more-recent work of Orlitsky et al [11], [12]. These authors called attention to the discrepancy between the asymptotic that is typically used in information theory and many realistic data sources, and showed that a function of the observed sequence called the pattern can be universally compressed, even when the alphabet is infinite and the source distribution is arbitrary.…”
Section: Connections To the Literature
confidence: 99%
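
For readers unfamiliar with patterns: the pattern of a sequence replaces each symbol by the rank of its first appearance, which is what makes it compressible independently of the alphabet. A minimal illustrative sketch in Python (the function name and example string are mine, not from the cited works):

    def pattern(seq):
        # Map each symbol to the order in which it first appears (1-based),
        # e.g. "abracadabra" -> [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1].
        first_seen = {}
        out = []
        for s in seq:
            if s not in first_seen:
                first_seen[s] = len(first_seen) + 1
            out.append(first_seen[s])
        return out

    print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]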
“…It was first introduced by Åberg in [8] as a solution to the multi-alphabet coding problem, where the message x contains only a small subset of the known alphabet A. It was further studied and motivated in a series of articles by Shamir [9][10][11][12] and by Jevtić, Orlitsky, Santhanam and Zhang [13][14][15][16] for practical applications: the alphabet is unknown and has to be transmitted separately anyway (for instance, transmission of a text in an unknown language), or the alphabet is very large in comparison to the message (consider the case of images with $k = 2^{24}$ colors, or texts when taking words as the alphabet units).…”
Section: Dictionary and Pattern
confidence: 99%
“…Using the same notations as in [15], we call the multiplicity $\mu_j(\psi)$ of symbol $j$ in pattern $\psi \in \mathcal{P}^n$ the number of occurrences of $j$ in $\psi$; the multiplicity of pattern $\psi$ is the vector made of all symbols' multiplicities:…”
Section: Dictionary and Pattern
confidence: 99%
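
Concretely, the multiplicity vector of a pattern just counts how often each symbol index occurs. A small sketch continuing the example above (the helper name is mine, not from the cited paper):

    from collections import Counter

    def multiplicities(psi):
        # mu_j(psi) = number of occurrences of symbol j in pattern psi,
        # returned as the vector (mu_1, ..., mu_m) with m = max(psi).
        counts = Counter(psi)
        return [counts[j] for j in range(1, max(psi) + 1)]

    print(multiplicities([1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]))  # [5, 2, 2, 1, 1]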