A basic combinatorial interpretation of Shannon's entropy function is via the "20 questions" game. This cooperative game is played by two players, Alice and Bob: Alice picks a distribution π over the numbers {1, ..., n}, and announces it to Bob. She then chooses a number x according to π, and Bob attempts to identify x using as few Yes/No queries as possible, on average. An optimal strategy for the "20 questions" game is given by a Huffman code for π: Bob's questions reveal the codeword for x bit by bit. This strategy finds x using fewer than H(π) + 1 questions on average. However, the questions asked by Bob could be arbitrary. In this paper, we investigate the following question: Are there restricted sets of questions that match the performance of Huffman codes, either exactly or approximately?

Our first main result shows that for every distribution π, Bob has a strategy that uses only questions of the form "x < c?" and "x = c?", and uncovers x using at most H(π) + 1 questions on average, matching the performance of Huffman codes in this sense. We also give a natural set of O(rn^{1/r}) questions that achieve a performance of at most H(π) + r, and show that Ω(rn^{1/r}) questions are required to achieve such a guarantee.

Our second main result gives a set Q of 1.25^{n+o(n)} questions such that for every distribution π, Bob can implement an optimal strategy for π using only questions from Q. We also show that 1.25^{n−o(n)} questions are needed, for infinitely many n. If we allow a small slack of r over the optimal strategy, then roughly (rn)^{Θ(1/r)} questions are necessary and sufficient.

We summarize this with the following meta-question, which guides this work: Are there "nice" sets of queries Q such that for any distribution, there is a "high quality" strategy that uses only queries from Q?

Formalizing this question depends on how "nice" and "high quality" are quantified. We consider two different benchmarks for sets of queries:

1. An information-theoretical benchmark: A set of queries Q has redundancy r if for every distribution π there is a strategy using only queries from Q that finds x with at most H(π) + r queries on average when x is drawn according to π.
2. A combinatorial benchmark: A set of queries Q is r-optimal (or has prolixity r) if for every distribution π there is a strategy using queries from Q that finds x with at most Opt(π) + r queries on average when x is drawn according to π, where Opt(π) is the expected number of queries asked by an optimal strategy for π (e.g. a Huffman tree).

Given a certain redundancy or prolixity, we will be interested in sets of questions achieving that performance that (i) are as small as possible, and (ii) allow efficient construction of high quality strategies which achieve the target performance. In some cases we will settle for only one of these properties, and leave the other as an open question.

Information-theoretical benchmark. Let π be a distribution over X. A basic result in information theory is that every algorithm that reveals an unknown element x drawn according to π (in...