In this paper, we consider the so-called uniquely decodable one-to-one code (UDOOC), which is formed by inserting a "comma" indicator, termed the unique word (UW), between consecutive one-to-one codewords to separate them. We first investigate several general combinatorial properties of UDOOCs, in particular the enumeration of UDOOC codewords of any (finite) length. From the resulting formula for the number of length-n codewords for a given UW, the per-letter average codeword length of a UDOOC that optimally compresses a source with given statistics can be computed. Several upper bounds on the average codeword length of such UDOOCs are then established. Analysis of these bounds leads to two asymptotic bounds for sources with infinitely large alphabets: one is achievable, and hence tight, for certain source statistics and UWs; the other proves that UDOOCs achieve the source entropy rate when both the block size of source letters used for UDOOC compression and the UW length go to infinity. Efficient encoding and decoding algorithms for UDOOCs are also given. Numerical results show that, when three English letters are grouped as a block, the UDOOCs with UW = 0001, 0000, 000001 and 000000 reach compression rates of 3.531, 4.089, 4.115 and 4.709 bits per English letter, respectively (with the lengths of the UWs included), where the source stream to be compressed is the book Alice's Adventures in Wonderland.
In comparison with the first-order, second-order and third-order Huffman codes and the Lempel-Ziv code, which achieve compression rates of 3.940, 3.585, 3.226 and 6.028 bits per English letter, respectively, the proposed UDOOCs can potentially achieve a compression rate comparable to that of the Huffman code under similar decoding complexity, and they yield a smaller average codeword length than the Lempel-Ziv code, thereby confirming the practicability of UDOOCs.
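The comma-insertion idea summarized above can be illustrated with a minimal sketch. It rests on an assumption that the abstract does not state: a binary string c is taken to be an admissible codeword for a UW k when k occurs in the concatenation kck only as its prefix and its suffix. The function names (`is_valid`, `codewords`, `encode`, `decode`) are hypothetical, and the brute-force enumeration is for illustration only; it is not the enumeration formula or the efficient algorithms referred to above.

```python
from itertools import product


def is_valid(codeword, uw):
    """Assumed admissibility rule: uw appears in uw + codeword + uw
    only as the leading prefix and the trailing suffix."""
    s = uw + codeword + uw
    last = len(s) - len(uw)  # start index of the trailing uw
    return all(not s.startswith(uw, i) for i in range(1, last))


def codewords(uw, n):
    """Brute-force list of admissible length-n binary codewords."""
    result = []
    for bits in product("01", repeat=n):
        candidate = "".join(bits)
        if is_valid(candidate, uw):
            result.append(candidate)
    return result


def encode(words, uw):
    """Put uw before every codeword and once more at the end."""
    for w in words:
        if not is_valid(w, uw):
            raise ValueError(f"{w!r} is not admissible for UW {uw!r}")
    return "".join(uw + w for w in words) + uw


def decode(stream, uw):
    """Recover the codewords by scanning for successive uw occurrences."""
    if not stream.startswith(uw):
        raise ValueError("stream does not start with the UW")
    out = []
    pos = 0
    while pos + len(uw) < len(stream):
        nxt = stream.find(uw, pos + 1)  # next separator
        if nxt < 0:
            raise ValueError("missing terminating UW")
        out.append(stream[pos + len(uw):nxt])
        pos = nxt
    return out


if __name__ == "__main__":
    uw = "0001"
    message = ["00", "1", "101"]
    stream = encode(message, uw)
    print(stream)
    print(decode(stream, uw))
    print(len(codewords(uw, 4)), "admissible codewords of length 4")
```

Under the assumed rule, the UW can occur in the encoded stream only at codeword boundaries, which is why the decoder can split the stream by a plain left-to-right search.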