“…This yields the well known bound, for which the cost of each unknown probability parameter is 0.5 log n bits. Recently, we showed [30], [33] that in the case of large alphabets, the simple grid used to achieve the fixed k bound is not sufficient. In the minimax case, a non-uniform grid with increasing spacing in each dimension was created, and resulted in a cost of 0.5 log(n/k) bits for each unknown probability parameter.…”
Section: Average Case - Background (mentioning)
confidence: 99%
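The two per-parameter costs quoted above can be compared numerically. The following is a minimal sketch, not the cited authors' construction: it keeps only the leading-order term, assumes base-2 logarithms (since the costs are stated in bits), and multiplies by k − 1 free parameters. The function name `parameter_cost_bits` and its arguments are hypothetical.

```python
import math


def parameter_cost_bits(n, k, large_alphabet=False):
    """Leading-order cost in bits of describing k - 1 free parameters.

    Assumption: only the dominant term is kept; lower-order terms are ignored.
    - fixed-k bound:        0.5 * log2(n)     bits per parameter
    - large-alphabet bound: 0.5 * log2(n / k) bits per parameter
    """
    if not (1 <= k <= n):
        raise ValueError("expected 1 <= k <= n")
    ratio = n / k if large_alphabet else n
    return (k - 1) * 0.5 * math.log2(ratio)


if __name__ == "__main__":
    n, k = 1024, 4
    print(parameter_cost_bits(n, k))                       # fixed-k grid
    print(parameter_cost_bits(n, k, large_alphabet=True))  # non-uniform grid
```

The gap between the two values grows with k, which is the regime the quoted passage is concerned with.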
“…In [30], [32]-[33], it was established that for a large known alphabet of size k, choosing a set Ω of M sources θ whose k − 1 free components are placed only at points on a non-uniform grid of increased spacing in each dimension yields a set of distinguishable sources if the grid spacing is properly chosen. The k − 1 components of grid points take values only from the grid vector…”
Section: A Maximin and Minimax Lower Bound (mentioning)
confidence: 99%
“…case (see, e.g., [30]). In particular, we must include A_k in the error event, although we can use the assumption that θ_k ≥ θ_i for all i, 1 ≤ i ≤ k − 1.…”
Section: Appendix B - Proof of Lemma 52 (mentioning)
confidence: 99%
“…The upper bounds are obtained by a constructive approach. For small k's it combines Rissanen's approach [24] with our recent approach from [30], [33] and with the more demanding conditions in coding patterns.…”
Universal compression of patterns of sequences generated by independently identically distributed (i.i.d.) sources with unknown, possibly large, alphabets is investigated. A pattern is a sequence of indices that contains all consecutive indices in increasing order of first occurrence.
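The definition of a pattern given above can be illustrated with a short sketch. This only demonstrates the definition (each symbol is replaced by the index of its first occurrence, with indices assigned 1, 2, 3, ... in order of appearance). It is not a compression scheme, and the function name `pattern` is a hypothetical choice.

```python
def pattern(sequence):
    """Return the pattern of a sequence as a list of 1-based indices.

    Each distinct symbol receives the next unused index the first time
    it appears; later occurrences reuse that index.
    """
    first_index = {}  # symbol -> index assigned at its first occurrence
    result = []
    for symbol in sequence:
        if symbol not in first_index:
            first_index[symbol] = len(first_index) + 1
        result.append(first_index[symbol])
    return result


if __name__ == "__main__":
    print(pattern("abracadabra"))  # [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

Two sequences over different alphabets share a pattern exactly when one is a relabeling of the other, so the pattern discards the identities of the symbols and keeps only their order of appearance.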
“…It was first introduced by Åberg in [8] as a solution to the multi-alphabet coding problem, where the message x contains only a small subset of the known alphabet A. It was further studied and motivated in a series of articles by Shamir [9][10][11][12] and by Jevtić, Orlitsky, Santhanam and Zhang [13][14][15][16] for practical applications: the alphabet is unknown and has to be transmitted separately anyway (for instance, transmission of a text in an unknown language), or the alphabet is very large in comparison to the message (consider the case of images with k = 2^24 colors, or texts when taking words as the alphabet units).…”
We show that the maximin average redundancy in pattern coding is eventually larger than 1.84 (n/log n)^{1/3} for messages of length n. This improves recent results on pattern redundancy, although it does not fill the gap between known lower and upper bounds. The pattern of a string is obtained by replacing each symbol by the index of its first occurrence. The problem of pattern coding is of interest because strongly universal codes have been proved to exist for patterns, while universal message coding is impossible for memoryless sources on an infinite alphabet. The proof uses fine combinatorial results on partitions with small summands.